Preface



These proceedings collect the major part of the lectures given at ENUMATH2003,
the European Conference on Numerical Mathematics and Advanced Applications,
held in Prague, Czech Republic, from 18 August to 22 August, 2003.

The importance of numerical and computational mathematics and scientific
computing is steadily growing. There is an increasing number of different
research areas where numerical simulation is necessary, for example fluid
dynamics, continuum mechanics, electromagnetism, phase transition, cosmology,
medicine, economics and finance. The success of applications of numerical
methods depends on adapting their basic instruments and on looking for new
techniques suited to new problems as well as to new computer architectures.

The ENUMATH conferences were established in order to provide a forum for
discussion of current topics of numerical mathematics. They seek to convene
leading experts and young scientists with special emphasis on contributions
from Europe. Recent results and new trends are discussed in the analysis of
numerical algorithms as well as in their applications to challenging
scientific and industrial problems.

The first ENUMATH conference was organized in Paris in 1995; the series then
continued with the conferences in Heidelberg (1997), Jyväskylä (1999) and
Ischia Porto (2001). It was a great pleasure and honour for the Czech
numerical community that it was decided at Ischia Porto to organize
ENUMATH2003 in Prague. It was the first time that this conference crossed the
former Iron Curtain and was organized in a post-socialist country.

ENUMATH2003 was organized by the Faculty of Mathematics and Physics of Charles
University in cooperation with the Department of Mathematics of the Institute
of Chemical Technology in Prague. Charles University, the oldest university in
Central Europe, was founded in 1348. In the Middle Ages, mathematics was
studied in Prague at the Faculty of Arts, later at the Philosophical Faculty,
and in the 20th century it belonged to the Faculty of Natural Sciences until
1952, when the Faculty of Mathematics and Physics was founded. As follows from
historical sources, already in the 15th century the students of Charles
University had the opportunity to be trained in "Computational Mathematics".
Křišťan from Prachatice, who is well known in Czech history as a Church
reformer (he was a friend of the famous reformer Jan Hus) and as a personal
physician of the Czech and German-Roman King Wenceslas IV, was a professor of
astronomy and mathematics at Charles University. He wrote lecture notes with
the title "Algoritmus Prosaicus", where he describes, among other things,
approximate methods for carrying out various mathematical operations. The
contemporary Czech school of numerical and applied mathematics was born much
later, in the second half of the 20th century. It is connected mainly with the
names of Ivo Babuška, who can be considered the father of Czech numerical
mathematics, Karel Rektorys, Milan Práger, Emil Vitásek, Miloš Zlámal,
Alexander Ženíšek, Ivan Hlaváček, the famous specialist in PDEs Jindřich Nečas
(who influenced a number of Czech numerical analysts) and Jan Polášek (who
founded the Prague school of CFD).

These proceedings contain a selection of invited plenary lectures, papers
presented in minisymposia and works communicated within the sessions. All
contributions to these proceedings have been reviewed by members of the
Scientific Committee. On this occasion we want to thank the members of the
Program Committee (F. Brezzi, M. Feistauer, R. Glowinski, R. Jeltsch,
Yu. Kuznetsov, J. Périaux, R. Rannacher) and the members of the Scientific
Committee (O. Axelsson, C. Bernardi, C. Canuto, M. Griebel, R. Hoppe,
G. Kobelkov, M. Křížek, P. Neittaanmäki, O. Pironneau, A. Quarteroni,
C. Schwab, E. Süli, W. Wendland) for their scientific support. We are grateful
to the plenary speakers R. Blaheta, A. Bermúdez, T. Gallouët, J. Haslinger,
R. Hiptmair, T. Hughes, J. Rappaz, A. Russo, A. Tveito and V. Schulz for
coming to Prague and richly contributing to the success of the conference.

We are very much obliged to A. Klíč, the head of the Department of
Mathematics of the Institute of Chemical Technology, and our colleagues
J. Felcman, J. Segethová and E. Plandorová from the local organizing committee
for their cooperation in the organization of the conference. We are also
gratefully indebted to O. Ulrych for the TeX editing and the preparation of
the camera-ready manuscript of the proceedings. Finally, we thank all
participants for coming and animating the meeting.

We believe that this volume will be an invaluable instrument for obtaining an
overview of the latest results and aspects of numerical mathematics and
scientific computing and their applications.



M. Feistauer
V. Dolejší
P. Knobloch
K. Najzar
editors




Table of Contents 



Part I Plenary Lectures

Numerical Analysis of Finite Element Methods for Eddy
Current Problems. Applications to Electrode Simulation

Alfredo Bermúdez, Rodolfo Rodríguez, Pilar Salgado

Space Decomposition Preconditioners and Parallel Solvers

Radim Blaheta

Boundary Conditions for Hyperbolic Equations or Systems

Thierry Gallouët

Fictitious Domain Methods in Shape Optimization with
Applications in Free-Boundary Problems

Jaroslav Haslinger, Tomáš Kozubek, Karl Kunisch, Gunter Peichl



Part II Contributed Papers

Domain Decomposition Method for a Class of Non-Linear
Elliptic Equation with Arbitrary Growth Nonlinearity and
Data Measure

Nour Eddine Alaa, Jean Rodolphe Roche

Variants of Relaxation Schemes and the Lattice Boltzmann
Model Relaxation Systems

Mapundi Kondwani Banda

A Time Semi-Implicit Relaxation Scheme for Two-Phase
Flows in Pipelines

Michael Baudin, Frédéric Coquel, Quang-Huy Tran

Computational Study of Field Scale BTEX Transport and
Biodegradation in the Subsurface

Markus Bause





A Two-Level Stabilization Scheme for the Navier-Stokes
Equations

Roland Becker, Malte Braack

A Posteriori Error Estimates for Parameter Identification

Roland Becker, Boris Vexler

On a Phase-Field Model with Advection

Michal Beneš

Fast Evaluation of Eddy Current Integral Operators

Steffen Börm

Adaptive Computation of Reactive Flows with Local Mesh
Refinement and Model Adaptation

Malte Braack, Alexandre Ern

An Alternative to the Least-Squares Mixed Finite Element
Method for Elliptic Problems

Jan Brandts, Yanping Chen

Limit Analysis Method in Electrostatics

Igor A. Brigadnov

Finite Element Mesh Adjusted to Singularities Applied to
Axisymmetric and Plane Flow

Pavel Burda, Jaroslav Novotný, Bedřich Sousedík, Jakub Šístek

The Edge Stabilization Method for Finite Elements in CFD

Erik Burman, Peter Hansbo

Analysis and Computation of Dendritic Growth in Binary
Alloys Using a Phase-Field Model

Erik Burman, Marco Picasso, Jacques Rappaz

Discontinuous Galerkin Methods for Timoshenko Beams

Fatih Celiker, Bernardo Cockburn, Sukru Güzey, Ramdev Kanapady,
Sew-Chew Soon, Henrik K. Stolarski, Kumar Tamma

Numerical Algorithms for Solving Elliptic-Parabolic
Problems

Raimondas Čiegis

Stochastic Relaxation of Variational Integrals with
Non-attainable Infima

Dennis D. Cox, Petr Klouček, Daniel R. Reynolds, Pavel Šolín







A Pressure-Weighted Upwind Scheme in Unstructured
Finite-Element Grids

Masoud Darbandi, Kiumars Mazaheri-Body, Shidvash Vakilipour

Discontinuous Galerkin Finite Element Method for the
Numerical Solution of Viscous Compressible Flows

Vít Dolejší

A Finite Volume Scheme on General Meshes for the Steady
Navier-Stokes Equations in Two Space Dimensions

Robert Eymard, Raphaèle Herbin

Existence and Uniqueness of a Weak Solution
to a Stratigraphic Model

Robert Eymard, Thierry Gallouët, Véronique Gervais, Roland Masson

Combined Nonconforming/Mixed-hybrid Finite Element-
Finite Volume Scheme for Degenerate Parabolic Problems

Robert Eymard, Danielle Hilhorst, Martin Vohralík

Discrete Maximum Principle for Galerkin Finite Element
Solutions to Parabolic Problems on Rectangular Meshes

István Faragó, Róbert Horváth, Sergey Korotov

Cubature-Differences Method for Singular Integro-differential
Equations

Alexander I. Fedotov

Nonconforming Discretization Techniques for Overlapping
Domain Decompositions

Bernd Flemisch, Michael Mair, Barbara Wohlmuth

On the Use of Implicit Updates in Minimum Curvature
Multi-step Quasi-Newton Methods

John A. Ford, Issam A. Moghrabi

A Boundary Movement Identification Method for a Parabolic
Partial Differential Equation

Tom P. Fredman

On Computational Properties of a Posteriori Error Estimates
Based upon the Method of Duality Error Majorants

Maxim Frolov, Pekka Neittaanmäki, Sergey Repin

Efficient Algorithm for Local-Bound-Preserving Remapping
in ALE Methods

Rao Garimella, Milan Kuchařík, Mikhail Shashkov







Mimetic Finite Difference Methods for Diffusion Equations
on Unstructured Triangular Grid

Victor Ganzha, Richard Liska, Mikhail Shashkov, Christoph Zenger

On Computational Glaciology: FE-Simulation of Ice Sheet
Dynamics

Gunter Godert, Franz-Theo Suttmeier

Nonreflecting Boundary Conditions for Multiple Domain
Wave Scattering in Unbounded Media

Marcus J. Grote, Christoph Kirsch, Patrick Meury

On the Choice of the Regularization Parameter in the Case
of the Approximately Given Noise Level of Data

Uno Hämarik, Toomas Raus

Adaptive Discontinuous Galerkin Finite Element Methods
with Interior Penalty for the Compressible Navier-Stokes
Equations

Ralf Hartmann, Paul Houston

On a Novel Technique for Parallel Unstructured Mesh
Generation in 3D

Jan Haškovec, Pavel Šolín

Adaptive Finite Element Methods for Turbulent Flow

Johan Hoffman, Claes Johnson

Numerical Solution of a Nonlinear Evolution Equation
Describing Amorphous Surface Growth of Thin Films

Ronald H. W. Hoppe, Eva Nash

Constrained Mountain Pass Algorithm for the Numerical
Solution of Semilinear Elliptic Problems

Jiří Horák

Optimal Shape Design of Diesel Intake Ports with
Evolutionary Algorithm

András Horváth, Zoltán Horváth

Numerical Simulation of Compressible Fluids with Moving
Boundaries: An Effective Method with Applications

Zoltán Horváth, András Horváth

Discontinuous Galerkin Methods for the Time-Harmonic
Maxwell Equations

Paul Houston, Ilaria Perugia, Anna Schneebeli, Dominik Schötzau







Mixed hp-Discontinuous Galerkin Finite Element Methods
for the Stokes Problem in Polygons

Paul Houston, Dominik Schötzau, Thomas P. Wihler

A Postprocessing of Hopf Bifurcation Points

Dáša Janovská, Vladimír Janovský

Givens' Reduction of Quaternion-Valued Matrices to Upper
Hessenberg Form

Drahoslava Janovská, Gerhard Opfer

Model of Compressible Flow and Transport in a Time-
Dependent Domain

Pavel Jiránek, Jiří Maryška, Jan Šembera

Numerical Study of Convection of Multi-Component Fluid
in Porous Medium

Olga Kantur, Vyacheslav Tsybulin

Multi-yield Elastoplastic Continuum - Modeling and
Computations

Johanna Kienesberger, Jan Valdman

Celebrating Fifty Years of David M. Young's Successive
Overrelaxation Method

David R. Kincaid

On the Relational Database Style Parallel Numerical
Programming

Bela Kiss, Anna Krebsz

A Dynamical System Describing Evolution of the Implicit
Surfaces in Incompressible Viscous Liquids

Petr Klouček, Michel V. Romerio, Jennifer L. Wightman

Discrete Maximum Principles in Finite Element Modelling

Sergey Korotov, Michal Křížek

A Posteriori Error Estimation in Terms of Linear Functionals
for Boundary Value Problems of Elliptic Type

Sergey Korotov, Pekka Neittaanmäki, Sergey Repin

Numerical Solution of Flow in Backward Facing Step

Karel Kozel, Petr Louda, Petr Sváček

Periodicity Properties of Solutions to a Hysteresis Model in
Micromagnetics

Martin Kružík





Mixed Finite Element Method on Polygonal and Polyhedral
Meshes

Yuri Kuznetsov, Sergey Repin

Semi-discrete Schemes for Hamilton-Jacobi Equations on
Unstructured Grids

Doron Levy, Suhas Nayak

Numerical Simulation of Dislocation Dynamics

Vojtěch Minárik, Jan Kratochvíl, Karol Mikula, Michal Beneš

Implicit FEM-FCT Algorithm for Compressible Flows

Matthias Möller, Dmitri Kuzmin, Stefan Turek

A Singular Limit Method for the Stefan Problems

Hideki Murakawa, Tatsuyuki Nakaki

Higher-Order Split-Step Schemes for the Generalized
Nonlinear Schrödinger Equation

Gulcin M. Muslu, Husnu A. Erbay

Numerical Methods and Simulation Techniques for Flow
with Shear and Pressure Dependent Viscosity

Abderrahim Ouazzi, Stefan Turek

Piecewise Polynomial Approximations for Linear Volterra
Integro-Differential Equations with Nonsmooth Kernels

Arvet Pedas

On a Discontinuous Galerkin Method for Radiation-Diffusion
Problems

Ilaria Perugia, Dominik Schötzau, James Warsa

Modeling of Multi-Phase Flows with a Level-Set Method

Sander P. van der Pijl, A. Segal, C. Vuik

Numerical Modeling of Bypass Flow

Vladimir Prokop, Karel Kozel

A Posteriori Estimation of Dimension Reduction Errors

Sergey Repin, Stefan Sauter, Anton Smolianski

Analysis of a Multi-Numerics/Multi-Physics Problem

Béatrice Rivière

The Discontinuous Galerkin Method for Singularly Perturbed
Problems

Hans-Görg Roos, Helena Zarin





A Finite-Volume Mass- and Vorticity-Conserving Shallow-
Water Model using Penta-/Hexagonal Grids

William Sawyer, Rolf Jeltsch

Application of Parallel Computing Techniques for Problems
of Degenerated Diffusion

Milan Šenkýř, Jiří Mikyška, Michal Beneš

The Finite Element Analysis of an Elliptic Problem with a
Nonlinear Newton Boundary Condition

Veronika Sobotíková

Automatic Goal-Oriented hp-Adaptivity Without Error
Estimates

Pavel Šolín, Leszek Demkowicz

A Compression Method for the Helmholtz Equation

Mirjam Stolper, Sergej Rjasanow

Application of a Stabilized FEM to Problems of Aeroelasticity

Petr Sváček, Miloslav Feistauer

A Numerical Approach to the Dynamical Behavior of
Initiated Pulses in Some Nonlinear Diffusion Equations

Kenji Tomoeda

Fully Two-dimensional HLLEC Riemann Solver and
Associated Difference Schemes

Pavel Váchal, Richard Liska, Burton Wendroff

Deflation Accelerated Parallel Preconditioned Conjugate
Gradient Method in Finite Element Problems

Fred J. Vermolen, Kees Vuik, Guus Segal

Advantages of Binomial Checkpointing for Memory-reduced
Adjoint Calculations

Andrea Walther, Andreas Griewank

An Efficient Multigrid FEM Solution Technique for
Incompressible Flow with Moving Rigid Bodies

Decheng Wan, Stefan Turek, Liudmila S. Rivkind

Higher-Order FEM for a System of Nonlinear Parabolic
PDE's in 2D with A-Posteriori Error Estimates

Martin Zítka, Karel Segeth, Pavel Šolín




Part I 



Plenary Lectures 




Numerical Analysis of Finite Element Methods
for Eddy Current Problems. Applications to
Electrode Simulation

Summary. The objective of this work is to introduce and numerically solve a 3D
mathematical model for the steady thermoelectrical behavior of electrodes in a
metallurgical electric furnace. The mathematical model couples the
time-harmonic eddy current model with the heat transfer equations in a bounded
3D domain. An important part of the paper deals with the analysis and
numerical solution of the eddy current model in a bounded domain.



1 Introduction 

Silicon is produced industrially by reduction of silicon dioxide with carbon,
by a reaction which can be written in a simple way as follows:

\[ \mathrm{SiO_2} + 2\,\mathrm{C} = \mathrm{Si} + 2\,\mathrm{CO}. \]

This reaction takes place in submerged arc furnaces which use three-phase
alternating current. A simple sketch of the furnace can be seen in Figure 1. It
consists of a cylindrical pot containing charge materials and three electrodes
arranged at the vertices of an equilateral triangle.

Electrodes are the main components of reduction furnaces and their purpose is
to conduct the electric current, which enters the electrode through the
"contact clamps" (see Figure 1). The electric current flows down the part of
the column between the contact clamps and its lower end, generating heat by
the Joule effect. At the tip of the electrode an electric arc is produced,
reaching temperatures of about 2500 °C, which are needed for the chemical
reduction reactions to take place.

Classical electrodes extensively used in industry include pure graphite,
prebaked and Søderberg electrodes. The latter are the most widely used in the
ferro-silicon industry; they are composed of a paste, consisting of a carbon
aggregate and a tar binder, which is fed into a steel casing. The casing has
steel fins attached to its inner part, which are placed radially in the
cylinder. The great amount of heat generated by the Joule effect is partially
employed to bake the paste; this is a crucial process during which the
initially soft/liquid non-conductive paste at the top of the electrode becomes
a solid conductor. The advantages of Søderberg electrodes with respect to pure
graphite or prebaked electrodes are that they can be built in larger sizes and
cost less. However, as the electrode is consumed, it has to be slipped, and
the steel casing moves with the carbon body, so it melts and pollutes the
silicon. This is why they cannot be used to obtain silicon metal, or silicon
of metallurgical quality, which is used as an alloying element for other
metals such as aluminum. Thus prebaked electrodes have been for many years the
only alternative for commercial silicon metal production.

In the early nineties, the Spanish company Ferroatlántica S.L. built a new
compound electrode named ELSA ([14]) which serves for the production of
silicon metal. It seems to be the solution for all silicon furnaces because its
cost can be up to one third the price of a prebaked electrode.

The ELSA electrode consists of a central column of baked carbonaceous
material, graphite or similar, surrounded by a Søderberg-like paste (see
Figure 2). There is a steel casing without fins that contains the paste until
it is baked in the contact clamps zone. Two different slipping systems exist,
one for the casing and another one for the central column; the combination of
both systems is necessary so as to slip the casing as little as possible and
also to carry out the correct extrusion of the carbon electrode. Then, unlike
in the case of Søderberg electrodes, the casing is not consumed and it is
possible to produce silicon of metallurgical quality. The result is that the
furnace operation is similar to that with prebaked electrodes, but the
compound electrode is less expensive. The disadvantage is that the slipping
velocity cannot be chosen freely as with prebaked electrodes: the paste has to
be baked before leaving the casing, so a minimum period of time is necessary
between slippages. Thus, baking of the paste is a crucial point in the
operation of this type of electrode.

Fig. 2. Sketch of ELSA electrode



In general, the design and control parameters of electrodes are very complex
and numerical simulation plays an important role at this point. Modeling the
phenomena involved on a computer allows us to analyze the influence of
changing a parameter without the need for expensive and difficult tests. Thus,
during the last 20 years, a significant number of mathematical models and
computer programs have been developed in order to simulate the thermoelectrical
behavior of classical electrodes (see for instance [15, 17, 18]). In
particular, the mathematical models based on cylindrical symmetry have been
the most extensively used. However, the ELSA electrode works in a different
manner from the classical electrodes. While a classical electrode has only one
constitutive material, the compound electrode combines a good electric current
conductor such as graphite with a paste which becomes a good conductor only at
high temperatures. The graphite core is important not only in the movement of
the column but also in the distribution of current inside the electrode.
Moreover, unlike in Søderberg electrodes, the absence of fins gives a
geometrical axisymmetry (see Figure 3).

This is why we first developed a finite element method based on cylindrical
symmetry to compute the electric current and temperature distribution in
a radial section of the electrode [5]. While the axisymmetric model has given
valuable information on important electrode parameters, the assumption of
cylindrical symmetry makes it necessary to neglect the following facts:

- The electromagnetic effect caused on one electrode by the two others, that
is, the so-called "proximity effect". This arises because the magnetic field
generated by each electrode induces eddy currents in the two others.

- Thermal boundary conditions are not axisymmetric. Indeed, the temperature
of the air around the electrode is greater on the surfaces oriented toward
the furnace center.

- The current entrance in the electrode through the contact clamps is not
axisymmetric. The current is transferred to the contact clamps through copper
bus tubes which, in turn, are connected to three transformers with different
phases. Then, in each electrode, half of the clamps receive current from
one transformer while the other ones are connected to a second transformer.

Fig. 3. Cross section of ELSA and Søderberg electrode

These points can only be taken into account by using a fully three-dimensional
model. Moreover, 3D models are always needed to simulate Søderberg electrodes
because the presence of fins breaks cylindrical symmetry (see Figure 3). Thus,
we have developed a three-dimensional thermoelectrical model which is general
enough to model any kind of electrode and even the complete furnace. In this
paper, we describe two different mathematical models and analyze them from
mathematical and numerical points of view.

The electromagnetic problem is obtained from the time-harmonic Maxwell
equations, assuming the frequency is low enough to neglect the term involving
the displacement current in Ampère's law. This is the so-called eddy current
model. Because of many interesting applications in electrical engineering,
numerical simulation of eddy current problems has led to a great number of
publications in recent years (see for instance [1, 2, 3, 10, 11, 12, 13]). We
notice that the Maxwell equations concern the whole space, but we are
interested in solving the problem in a bounded domain, so we have to define
suitable boundary conditions; this need is the main difficulty in studying the
problem in a bounded domain. Thus, we start by introducing the eddy current
problem in the whole furnace, including the electrodes and the air around, and
defining natural and essential boundary conditions. In a second step we change
this model, by introducing realistic boundary conditions, to compute the
electromagnetic fields in only one electrode. Finally, we couple the
electromagnetic model with a thermal one. The coupling between the Maxwell and
heat transfer equations is due to the Joule effect, which is the source term
in the heat equation, and to the fact that the thermoelectrical parameters
depend on temperature.

The outline of the paper is as follows. In Section 2 we deal with the
mathematical and numerical analysis of the electromagnetic problem in a bounded
3D domain which includes conductors and dielectrics. We introduce a weak
formulation which involves the magnetic field in the conductor domain and
a scalar magnetic potential in the dielectric one. This hybrid formulation is
discretized by using Nédélec edge finite elements for the magnetic field and
standard piecewise linear continuous elements for the magnetic potential. The
resulting discrete problems are studied and error estimates are obtained under
mild smoothness assumptions on the solution. Section 3 is devoted to proposing
and analyzing a finite element method to solve the electromagnetic problem in
only one electrode. We introduce a weak formulation of the problem in terms of
the magnetic field and deal with boundary conditions directly related to
the intensities which enter the domain. Lagrange multipliers are introduced
to impose these "non-standard" boundary conditions and the resulting mixed
formulations are studied following classical techniques. In Section 4, we couple
the electromagnetic problem with the thermal one and give a result concerning
the existence of a solution. We end the paper by reporting, in Section 5, some
numerical results obtained for ELSA and Søderberg electrodes.



2 The electromagnetic problem in the whole furnace 

In order to consider all the facts which are neglected in the axisymmetric
models, we start by proposing a model to solve the eddy current problem in
a bounded domain like the one presented in Figure 4, which includes not
only conductors (the electrodes and wires supplying the electric current), but
dielectrics as well (the air).

2.1 The eddy current problem 

Eddy currents are usually modeled by the low-frequency harmonic Maxwell
equations. Assuming alternating electric current of angular frequency
$\omega$, they are

\[ \mathbf{curl}\,\mathbf{H} = \mathbf{J}, \tag{1} \]
\[ i\omega\mu\mathbf{H} + \mathbf{curl}\,\mathbf{E} = \mathbf{0}, \tag{2} \]
\[ \operatorname{div}\mathbf{B} = 0, \tag{3} \]
\[ \operatorname{div}\mathbf{D} = \rho, \tag{4} \]
\[ \mathbf{B} = \mu\mathbf{H}, \quad \mathbf{D} = \varepsilon\mathbf{E}, \quad \mathbf{J} = \sigma\mathbf{E}, \tag{5} \]

where $\mathbf{H}$, $\mathbf{J}$, $\mathbf{B}$, $\mathbf{E}$, and $\mathbf{D}$
are the complex amplitudes associated with the magnetic field, the current
density, the magnetic induction, the electric field and the electric
displacement, respectively; $\rho$ is the electric charge density, $\mu$ is
the magnetic permeability, $\varepsilon$ is the electric permittivity and
$\sigma$ is the electric conductivity.

Fig. 4. Sketch of the furnace

Fig. 5. Sketch of a general domain

We will solve these equations in a bounded domain $\Omega$, which consists of
two parts, $\Omega_C$ and $\Omega_D$, occupied by conductors and dielectrics,
respectively (see Figure 5). The boundary of the domain $\Omega$ also splits
into two parts: $\Gamma_C := \partial\Omega_C \cap \partial\Omega$ and
$\Gamma_D := \partial\Omega_D \cap \partial\Omega$. Finally, we denote by
$\Gamma_I := \partial\Omega_C \cap \partial\Omega_D$ the interface between
dielectric and conductors. The boundary conditions added to the eddy current
model are

\[ \mathbf{E} \times \mathbf{n} = \mathbf{g} \quad \text{on } \Gamma_C, \tag{6} \]
\[ \mathbf{H} \times \mathbf{n} = \mathbf{f} \quad \text{on } \Gamma_D, \tag{7} \]

with $\mathbf{g}$ and $\mathbf{f}$ being given tangential vector fields (i.e.,
satisfying $\mathbf{g} \cdot \mathbf{n} = 0$ on $\Gamma_C$ and
$\mathbf{f} \cdot \mathbf{n} = 0$ on $\Gamma_D$) and $\mathbf{n}$ an outward
unit normal vector to $\partial\Omega$. We remark that (6) is the natural
condition for the conducting part of the boundary, while (7) is imposed on the
dielectric part and allows taking into account all of the electromagnetic
effects outside the domain.

We will introduce and analyze a finite element method to solve this problem
in domains of general topology. To attain this goal, we will consider a
formulation introduced by Bossavit and Vérité [13], which involves the magnetic
field in the conductor domain and a scalar magnetic potential in the dielectric
one. Then, as a first step, we start by analyzing a weak formulation of the
problem in terms of the magnetic field.

2.2 A magnetic field formulation of the eddy current problem 

Let us assume that $\Omega$ is simply connected, with a Lipschitz-continuous
connected boundary. The subdomains $\Omega_C$ and $\Omega_D$ are also assumed
to have Lipschitz-continuous boundaries, although not necessarily connected.
Finally, the surfaces $\Gamma_C$, $\Gamma_D$ and $\Gamma_I$ are also assumed
to have Lipschitz-continuous boundaries.

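Let us consider the following closed subspaces of
$\mathbf{H}(\mathbf{curl}, \Omega)$:

\[ V := \{ \mathbf{G} \in \mathbf{H}(\mathbf{curl}, \Omega) : \mathbf{curl}\,\mathbf{G} = \mathbf{0} \text{ in } \Omega_D \}, \]
\[ V^0 := \{ \mathbf{G} \in V : \mathbf{G} \times \mathbf{n} = \mathbf{0} \text{ in } H_{00}^{-1/2}(\Gamma_D)^3 \}, \]

where $H_{00}^{-1/2}(\Gamma_D)^3$ denotes the dual space of
$H_{00}^{1/2}(\Gamma_D)^3$ which, in its turn, is the space of functions
defined on $\Gamma_D$ that, extended by 0 on $\partial\Omega \setminus \Gamma_D$,
belong to $H^{1/2}(\partial\Omega)^3$. We assume that
$\mu, \varepsilon, \sigma \in L^\infty(\Omega)$, and that there exist constants
$\underline{\mu}$, $\underline{\varepsilon}$, and $\underline{\sigma}$ such that

\[ \mu(\mathbf{x}) \geq \underline{\mu} > 0, \quad \varepsilon(\mathbf{x}) \geq \underline{\varepsilon} > 0 \quad \text{a.e. in } \Omega, \]
\[ \sigma(\mathbf{x}) \geq \underline{\sigma} > 0 \quad \text{a.e. in } \Omega_C, \qquad \sigma(\mathbf{x}) = 0 \quad \text{in } \Omega_D. \]

We suppose that the boundary data $\mathbf{g}$ satisfies
$\mathbf{g} \times \mathbf{n} \in H_{00}^{1/2}(\Gamma_C)^3$. On the other hand,
concerning the boundary data $\mathbf{f}$, we suppose there exists a field
$\mathbf{H}_f \in V$ such that

\[ \mathbf{H}_f \times \mathbf{n} = \mathbf{f} \quad \text{in } H_{00}^{-1/2}(\Gamma_D)^3. \]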

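Then, multiplying equation (2) by a test function of the space $V^0$,
integrating in $\Omega$, and using Green's formula, (1), (6), and (7), we
obtain the following weak formulation in terms of the magnetic field
$\mathbf{H}$.

Problem MP. To find $\mathbf{H} \in V$ such that

\[ \mathbf{H} \times \mathbf{n} = \mathbf{f} \quad \text{in } H_{00}^{-1/2}(\Gamma_D)^3, \tag{8} \]
\[ i\omega \int_\Omega \mu\,\mathbf{H} \cdot \mathbf{G} + \int_{\Omega_C} \frac{1}{\sigma}\,\mathbf{curl}\,\mathbf{H} \cdot \mathbf{curl}\,\mathbf{G} = \langle \mathbf{g} \times \mathbf{n}, \mathbf{G} \times \mathbf{n} \rangle_{\Gamma_C} \quad \forall\, \mathbf{G} \in V^0. \tag{9} \]

Theorem 1. If there exists $\mathbf{H}_f \in V$ such that
$\mathbf{H}_f \times \mathbf{n} = \mathbf{f}$ in $H_{00}^{-1/2}(\Gamma_D)^3$,
then problem MP has a unique solution.

Once the magnetic field $\mathbf{H}$ is known, the current density $\mathbf{J}$
and the electric field $\mathbf{E}$ can be computed in conductors, namely,
$\mathbf{J} = \mathbf{curl}\,\mathbf{H}$ and
$\mathbf{E} = \frac{1}{\sigma}\,\mathbf{curl}\,\mathbf{H}$. These are the
quantities actually needed in most applications, and they satisfy the Maxwell
equations (1)-(5) and boundary conditions (6)-(7) (see Theorem 3.2 in [7]).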
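
The step from equation (2) to the weak form of problem MP can be sketched by a purely formal computation. The block below is only an illustration under stated assumptions: the fields are taken smooth enough for every integral to make sense classically (the duality pairing in the weak form is the rigorous counterpart), $\Gamma_C$ and $\Gamma_D$ denote the conducting and dielectric parts of $\partial\Omega$, $\Omega_C$ and $\Omega_D$ the conductor and dielectric subdomains, and test functions are written without complex conjugation.

```latex
% Formal sketch. Assumptions on the test field G:
%   curl G = 0 in \Omega_D   and   G x n = 0 on \Gamma_D.
\begin{align*}
0 &= \int_\Omega \bigl( i\omega\mu\,\mathbf{H} + \mathbf{curl}\,\mathbf{E} \bigr)\cdot\mathbf{G}
   && \text{(equation (2) tested with } \mathbf{G}\text{)} \\
  &= i\omega\int_\Omega \mu\,\mathbf{H}\cdot\mathbf{G}
     + \int_\Omega \mathbf{E}\cdot\mathbf{curl}\,\mathbf{G}
     + \int_{\partial\Omega} (\mathbf{n}\times\mathbf{E})\cdot\mathbf{G}
   && \text{(Green's formula)} \\
  &= i\omega\int_\Omega \mu\,\mathbf{H}\cdot\mathbf{G}
     + \int_{\Omega_C} \frac{1}{\sigma}\,\mathbf{curl}\,\mathbf{H}\cdot\mathbf{curl}\,\mathbf{G}
     - \int_{\Gamma_C} \mathbf{g}\cdot\mathbf{G}.
\end{align*}
% Last step, term by term:
%  * curl G = 0 in \Omega_D, and E = J/\sigma = (1/\sigma) curl H in \Omega_C by (1) and (5);
%  * on \Gamma_D: (n x E) . G = E . (G x n) = 0;
%  * on \Gamma_C: (n x E) . G = -(E x n) . G = -g . G by (6).
% Since g . n = 0, one has g . G = (g x n) . (G x n), which gives the right-hand side of (9).
```

Moving the boundary integral to the right-hand side recovers the form of (9), with the surface integral over $\Gamma_C$ playing the role of the pairing $\langle \mathbf{g} \times \mathbf{n}, \mathbf{G} \times \mathbf{n} \rangle_{\Gamma_C}$.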



2.3 Introducing a magnetic potential 

In this section we show how problem MP can be transformed by replacing the 
magnetic field in the dielectric domain by a (scalar) magnetic potential. 

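Let $\Omega_C = \bigcup_{j=0}^{J} \Omega_C^j$, with $\Omega_C^0$ being the union
of all the connected components of $\Omega_C$ such that
$\Omega \setminus \Omega_C^0$ is simply connected, and $\Omega_C^j$,
$j = 1, \ldots, J$, the remaining connected components of $\Omega_C$ (see
Figure 5).

We assume that for each $\Omega_C^j$, $j = 1, \ldots, J$, there exists an open
"cut" surface $\Sigma_j \subset \Omega_D$ such that
$\partial\Sigma_j \subset \partial\Omega_C^j$, and
$\tilde\Omega_D := \Omega_D \setminus \bigcup_{j=1}^{J} \Sigma_j$ is
pseudo-Lipschitz and simply connected (see Figure 5). We also assume that each
of these surfaces $\Sigma_j$ is connected, and
$\bar\Sigma_j \cap \bar\Sigma_k = \emptyset$ for $j \neq k$ (see, for
instance, [4]).

For any function $\tilde\Psi \in H^1(\tilde\Omega_D)$, we denote by
$[\![\tilde\Psi]\!]_{\Sigma_j}$ the jump of $\tilde\Psi$ through $\Sigma_j$.
The gradient of $\tilde\Psi$ in $\mathcal{D}'(\tilde\Omega_D)$ can be extended
to $L^2(\Omega_D)^3$ and will be denoted by
$\widetilde{\mathbf{grad}}\,\tilde\Psi$.

Let $\Theta$ be the linear space of $H^1(\tilde\Omega_D)$ defined by

\[ \Theta := \{ \tilde\Psi \in H^1(\tilde\Omega_D) : [\![\tilde\Psi]\!]_{\Sigma_j} = \text{constant}, \; j = 1, \ldots, J \}. \]

Then, for $\tilde\Psi \in H^1(\tilde\Omega_D)$, we have that
$\widetilde{\mathbf{grad}}\,\tilde\Psi \in \mathbf{H}(\mathbf{curl}, \Omega_D)$
if and only if $\tilde\Psi \in \Theta$, in which case
$\mathbf{curl}\,(\widetilde{\mathbf{grad}}\,\tilde\Psi) = \mathbf{0}$ (see
Lemma 3.11 in [4]). Then, for all $\mathbf{G} \in V$ there exists
$\tilde\Psi \in \Theta$ such that
$\mathbf{G}|_{\Omega_D} = \widetilde{\mathbf{grad}}\,\tilde\Psi$.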

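We introduce the following notation: for $\mathbf{G}_C \in L^2(\Omega_C)^3$
and $\mathbf{G}_D \in L^2(\Omega_D)^3$, we denote by
$(\mathbf{G}_C | \mathbf{G}_D)$ the field $\mathbf{G} \in L^2(\Omega)^3$
defined a.e. by

\[ \mathbf{G}(\mathbf{x}) := \begin{cases} \mathbf{G}_C(\mathbf{x}) & \text{if } \mathbf{x} \in \Omega_C, \\ \mathbf{G}_D(\mathbf{x}) & \text{if } \mathbf{x} \in \Omega_D. \end{cases} \]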


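Let us denote by $W$ the linear space given by

\[ W := \{ (\mathbf{G}, \tilde\Psi) \in \mathbf{H}(\mathbf{curl}, \Omega_C) \times (\Theta/\mathbb{C}) : (\mathbf{G} | \widetilde{\mathbf{grad}}\,\tilde\Psi) \in \mathbf{H}(\mathbf{curl}, \Omega) \}. \]

Similarly, we define the closed subspace of $W$

\[ W^0 := \{ (\mathbf{G}, \tilde\Psi) \in W : \widetilde{\mathbf{grad}}\,\tilde\Psi \times \mathbf{n} = \mathbf{0} \text{ in } H_{00}^{-1/2}(\Gamma_D)^3 \}. \]

By using this notation we can define the following problem:

Problem HP. To find $(\mathbf{H}, \tilde\Phi) \in W$ such that

\[ \widetilde{\mathbf{grad}}\,\tilde\Phi \times \mathbf{n} = \mathbf{f} \quad \text{in } H_{00}^{-1/2}(\Gamma_D)^3, \]
\[ i\omega \int_{\Omega_C} \mu\,\mathbf{H} \cdot \mathbf{G} + \int_{\Omega_C} \frac{1}{\sigma}\,\mathbf{curl}\,\mathbf{H} \cdot \mathbf{curl}\,\mathbf{G} + i\omega \int_{\Omega_D} \mu\,\widetilde{\mathbf{grad}}\,\tilde\Phi \cdot \widetilde{\mathbf{grad}}\,\tilde\Psi = \langle \mathbf{g} \times \mathbf{n}, \mathbf{G} \times \mathbf{n} \rangle_{\Gamma_C} \quad \forall\, (\mathbf{G}, \tilde\Psi) \in W^0. \]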

This is the well known magnetic field/magnetic potential hybrid formu- 
lation of the eddy current problem introduced by Bossavit and Verite [13]. 
The main advantage with respect to formulation (8)-(9) lies in the fact that 
a vector field is replaced by a scalar one in the dielectric domain. 

Theorem 2. Under the assumptions of Theorem 1, problem HP has a unique solution $(\mathbf{H}, \widetilde{\Psi})$, with $(\mathbf{H} \,|\, \widetilde{\mathrm{grad}}\,\widetilde{\Psi})$ being the unique solution of problem MP.



2.4 Numerical solution 

In this section we first introduce a discretization of problem MP and then we 
obtain a discrete version of problem HP equivalent to the previous one. 

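Let us assume $\Omega$, $\Omega_C$ and $\Omega_D$ are Lipschitz polyhedra, and consider a family of regular tetrahedral meshes $\{\mathcal{T}_h\}$ of $\Omega$ such that, for every mesh $\mathcal{T}_h$, each element $K \in \mathcal{T}_h$ is contained either in $\overline{\Omega}_C$ or in $\overline{\Omega}_D$ ($h$ stands as usual for the corresponding mesh-size).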

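The magnetic field is discretized by using Nédélec edge finite elements (see [19]). In particular, $\mathbf{H}$ is approximated in each tetrahedron $K$ by a polynomial vector field in the space

\[ \mathcal{N}(K) := \bigl\{ \mathbf{G}_h \in \mathcal{P}_1(K)^3 : \mathbf{G}_h(x) = \mathbf{a} \times x + \mathbf{b}, \ \mathbf{a}, \mathbf{b} \in \mathbb{C}^3, \ x \in K \bigr\}. \]

Then, fields in $H(\mathrm{curl}, \Omega)$ will be approximated in the following finite-dimensional space:

\[ \mathcal{N}_h(\Omega) := \bigl\{ \mathbf{G}_h \in H(\mathrm{curl}, \Omega) : \mathbf{G}_h|_K \in \mathcal{N}(K) \ \forall K \in \mathcal{T}_h \bigr\}. \]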
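A useful property of the local space $\mathcal{N}(K)$ is that every field $\mathbf{a} \times x + \mathbf{b}$ has the constant curl $2\mathbf{a}$, so the curl-free elements are exactly the constant fields. The following minimal sketch checks this numerically with central differences. It is only an illustration: the helper names `nedelec_field` and `numerical_curl` are hypothetical, and real coefficients are used for simplicity although the text allows $\mathbf{a}, \mathbf{b} \in \mathbb{C}^3$.

```python
import numpy as np

def nedelec_field(a, b):
    """Return G(x) = a x x + b, a lowest-order Nedelec shape function."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return lambda x: np.cross(a, x) + b

def numerical_curl(field, x, eps=1e-6):
    """Central-difference approximation of curl(field) at the point x."""
    x = np.asarray(x, dtype=float)
    jac = np.zeros((3, 3))            # jac[i, j] approximates d field_i / d x_j
    for j in range(3):
        step = np.zeros(3)
        step[j] = eps
        jac[:, j] = (field(x + step) - field(x - step)) / (2.0 * eps)
    return np.array([jac[2, 1] - jac[1, 2],
                     jac[0, 2] - jac[2, 0],
                     jac[1, 0] - jac[0, 1]])

# Arbitrary illustrative coefficients (assumed values, not from the paper).
a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 0.0, -1.0])
curl_at_point = numerical_curl(nedelec_field(a, b), [0.2, 0.7, -0.4])
```

Because the field is affine in $x$, the central difference is exact up to rounding, and `curl_at_point` agrees with `2 * a` at any evaluation point.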

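In order to use these elements to discretize problem MP, we have to use an approximation $\mathbf{f}_I$ of the boundary data $\mathbf{f}$ such that a discrete version of equation (8) can hold true. To attain this goal, we will use the two-dimensional Nédélec interpolant of $\mathbf{n} \times \mathbf{f}$ on the triangular mesh induced by $\mathcal{T}_h$ on the polyhedral surface $\Gamma_D$. This interpolant and several of its properties are described in detail in [7].

Then, in order to discretize problem MP, we introduce the following finite-dimensional spaces:

\[ V_h := \bigl\{ \mathbf{G}_h \in \mathcal{N}_h(\Omega) : \mathrm{curl}\,\mathbf{G}_h = \mathbf{0} \ \text{in } \Omega_D \bigr\}, \]
\[ V_h^0 := \bigl\{ \mathbf{G}_h \in V_h : \mathbf{G}_h \times \mathbf{n} = \mathbf{0} \ \text{on } \Gamma_D \bigr\}, \]

and obtain the following discrete magnetic problem:
Problem DMP.- Find $\mathbf{H}_h \in V_h$ such that

\[ \mathbf{H}_h \times \mathbf{n} = \mathbf{f}_I \quad \text{on } \Gamma_D, \]

\[ i\omega \int_{\Omega} \mu \mathbf{H}_h \cdot \bar{\mathbf{G}}_h + \int_{\Omega_C} \frac{1}{\sigma}\, \mathrm{curl}\,\mathbf{H}_h \cdot \mathrm{curl}\,\bar{\mathbf{G}}_h = \int_{\Gamma_C} \mathbf{g} \times \mathbf{n} \cdot \bar{\mathbf{G}}_h \times \mathbf{n} \quad \forall \mathbf{G}_h \in V_h^0. \]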



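Theorem 3. Let us assume that the solution $\mathbf{H}$ of problem MP satisfies $\mathbf{H}|_{\Omega_C} \in H^r(\mathrm{curl}, \Omega_C)$ and $\mathbf{H}|_{\Omega_D} \in H^r(\Omega_D)^3$ with $r \in (\tfrac{1}{2}, 1]$. Then, $\mathbf{f}_I$ is well defined by the 2D Nédélec interpolant of $\mathbf{n} \times \mathbf{f}$, problem DMP has a unique solution and the following error estimate holds:

\[ \|\mathbf{H} - \mathbf{H}_h\|_{H(\mathrm{curl}, \Omega)} \le C h^r \Bigl[ \|\mathbf{H}\|_{H^r(\mathrm{curl}, \Omega_C)} + \|\mathbf{H}\|_{H^r(\Omega_D)^3} \Bigr]. \]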

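However, problem DMP is actually just a "theoretical" method, in that its solution requires imposing somehow the curl-free condition in the definition of $V_h$ on trial and test functions. We therefore handle this curl-free condition by introducing a discrete multiple-valued magnetic potential in the dielectric domain.

We assume that the cut surfaces $\Sigma_j$ are polyhedral and that the meshes are compatible with them, in the sense that each $\Sigma_j$ is a union of faces of tetrahedra $K \in \mathcal{T}_h$, for each mesh $\mathcal{T}_h$. Therefore, $\mathcal{T}_h^{\Omega_D} := \{K \in \mathcal{T}_h : K \subset \overline{\Omega}_D\}$ can also be seen as a mesh of $\widetilde{\Omega}_D$.

In order to introduce an approximation of the space $\Theta$, let us denote

\[ \mathcal{L}_h(\widetilde{\Omega}_D) := \bigl\{ \widetilde{\Psi}_h \in H^1(\widetilde{\Omega}_D) : \widetilde{\Psi}_h|_K \in \mathcal{P}_1(K) \ \forall K \in \mathcal{T}_h^{\Omega_D} \bigr\}. \]

Then, we consider the family of finite-dimensional subspaces of $\Theta$ given by

\[ \Theta_h := \bigl\{ \widetilde{\Psi}_h \in \mathcal{L}_h(\widetilde{\Omega}_D) : [\![\widetilde{\Psi}_h]\!]_{\Sigma_j} = \text{constant}, \ j = 1, \ldots, J \bigr\}. \]

The following lemma shows that each curl-free vector field in $\mathcal{N}_h(\Omega_D)$ admits a multiple-valued potential in $\Theta_h$ (see [7]).

Lemma 1. Let $\mathbf{G}_h \in L^2(\Omega_D)^3$. Then $\mathbf{G}_h \in \mathcal{N}_h(\Omega_D)$ with $\mathrm{curl}\,\mathbf{G}_h = \mathbf{0}$ in $\Omega_D$ if and only if there exists $\widetilde{\Psi}_h \in \Theta_h$ such that $\mathbf{G}_h = \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h$ in $\Omega_D$. Such $\widetilde{\Psi}_h$ is unique up to an additive constant.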

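Let us introduce the following families of finite-dimensional approximations of $W$ and $W^0$, respectively:

\[ W_h := \bigl\{ (\mathbf{G}_h, \widetilde{\Psi}_h) \in \mathcal{N}_h(\Omega_C) \times (\Theta_h/\mathbb{C}) : (\mathbf{G}_h \,|\, \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h) \in H(\mathrm{curl}, \Omega) \bigr\}, \]

\[ W_h^0 := \bigl\{ (\mathbf{G}_h, \widetilde{\Psi}_h) \in W_h : \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h \times \mathbf{n} = \mathbf{0} \ \text{on } \Gamma_D \bigr\}. \]

Thus, we define the following discrete problem, which is equivalent to problem DMP:

Problem DHP.- Find $(\mathbf{H}_h, \widetilde{\Psi}_h) \in W_h$ such that

\[ \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h \times \mathbf{n} = \mathbf{f}_I \quad \text{on } \Gamma_D, \]

\[ i\omega \int_{\Omega_C} \mu \mathbf{H}_h \cdot \bar{\mathbf{G}}_h + \int_{\Omega_C} \frac{1}{\sigma}\, \mathrm{curl}\,\mathbf{H}_h \cdot \mathrm{curl}\,\bar{\mathbf{G}}_h + i\omega \int_{\Omega_D} \mu\, \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h \cdot \widetilde{\mathrm{grad}}\,\bar{\widetilde{\Phi}}_h = \int_{\Gamma_C} \mathbf{g} \times \mathbf{n} \cdot \bar{\mathbf{G}}_h \times \mathbf{n} \quad \forall (\mathbf{G}_h, \widetilde{\Phi}_h) \in W_h^0. \]

Theorem 4. Let us assume that the solution $(\mathbf{H}, \widetilde{\Psi})$ of problem HP satisfies $\mathbf{H} \in H^r(\mathrm{curl}, \Omega_C)$ and $\widetilde{\mathrm{grad}}\,\widetilde{\Psi} \in H^r(\Omega_D)^3$ with $r \in (\tfrac{1}{2}, 1]$. Then, problem DHP is well posed, it has a unique solution $(\mathbf{H}_h, \widetilde{\Psi}_h)$, and

\[ \|\mathbf{H} - \mathbf{H}_h\|_{H(\mathrm{curl}, \Omega_C)} + \|\widetilde{\mathrm{grad}}\,\widetilde{\Psi} - \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h\|_{L^2(\Omega_D)^3} \le C h^r \Bigl[ \|\mathbf{H}\|_{H^r(\mathrm{curl}, \Omega_C)} + \|\widetilde{\mathrm{grad}}\,\widetilde{\Psi}\|_{H^r(\Omega_D)^3} \Bigr]. \]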

Effective procedures to solve problem DHP numerically are described in [7]. In particular, numerical techniques to impose the following constraints are studied:

1. $(\mathbf{G}_h \,|\, \widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h) \in H(\mathrm{curl}, \Omega)$, which arises in the definition of $W_h$.

2. $[\![\widetilde{\Psi}_h]\!]_{\Sigma_j} = \text{constant}$, which arises in the definition of $\Theta_h$.

3. The boundary condition $\widetilde{\mathrm{grad}}\,\widetilde{\Psi}_h \times \mathbf{n} = \mathbf{f}_I$ on $\Gamma_D$.

The first constraint is imposed by eliminating the degrees of freedom of $\mathbf{G}_h$ associated with the edges lying on the interface between $\Omega_C$ and $\Omega_D$ in terms of those of $\widetilde{\Psi}_h$ corresponding to the vertices of the mesh on this interface.

The second constraint is handled by distinguishing the degrees of freedom of $\widetilde{\Psi}_h$ on one side of the surface $\Sigma_j$ from those on the other side, and by eliminating one set in terms of the other and the current intensities through each conductor.

The third constraint is imposed by means of a Lagrange multiplier, which increases the number of unknowns but has the advantage that the computer implementation is quite straightforward.
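The elimination used for the first constraint rests on a simple identity, which the following sketch illustrates on a single tetrahedron. It assumes the usual choice of degrees of freedom for the lowest-order Nédélec element, namely the tangential line integral along each edge; under that assumption, the edge value of the gradient of a piecewise linear potential is just the difference of its two nodal values. The function names and the sample data are hypothetical, and real values are used for simplicity.

```python
import numpy as np

def p1_gradient(vertices, nodal_values):
    """Constant gradient of the affine function taking nodal_values at the
    four vertices of a tetrahedron (rows of `vertices`)."""
    v = np.asarray(vertices, dtype=float)
    u = np.asarray(nodal_values, dtype=float)
    # Solve (v_i - v_0) . g = u_i - u_0 for i = 1, 2, 3.
    return np.linalg.solve(v[1:] - v[0], u[1:] - u[0])

def edge_dofs_of_gradient(vertices, nodal_values):
    """Tangential line integrals of grad(psi) along the six edges."""
    v = np.asarray(vertices, dtype=float)
    g = p1_gradient(v, nodal_values)
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    # The gradient is constant, so the line integral is g . (v_b - v_a).
    return {e: float(g @ (v[e[1]] - v[e[0]])) for e in edges}

# Reference tetrahedron and arbitrary nodal values of the potential.
vertices = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
psi = [2.0, -1.0, 0.5, 3.0]
dofs = edge_dofs_of_gradient(vertices, psi)
```

Each entry `dofs[(a, b)]` equals `psi[b] - psi[a]`, which is why the edge unknowns on the interface can be written in terms of the vertex unknowns of the potential.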

We have developed a MATLAB code which implements the method described above. To validate the computer code and to test the performance and convergence properties of the method, we have solved a problem with a known analytical solution (see [7] for further details).

3 The electromagnetic problem in one electrode

3.1 Statement of the problem 

The model described in the previous section presents some drawbacks. First, it is highly complex and its numerical solution takes a lot of time. Second, it is difficult to obtain the boundary data $\mathbf{f}$ from realistic data such as intensities or potentials, which usually are the only data we know. We therefore propose an alternative approach, which consists in solving the eddy current problem in one electrode, which is a particular bounded conducting domain. We analyze a weak formulation of this problem in terms of the magnetic field, considering boundary conditions that are realistic from the point of view of applications. In particular, following Bossavit [12], we consider boundary conditions directly related to the input current intensities which enter the electrode. We impose these boundary conditions by means of Lagrange multipliers and study the resulting mixed formulations.

Since we only consider the conducting domain, we obtain an important saving in computer time when compared with the model of the whole furnace, and we are still able to consider some important effects which are not taken into account by the axisymmetric models, although not the proximity effect.

We consider a bounded conducting domain $\Omega$ having a Lipschitz-continuous and connected boundary. However, it is not necessary that $\Omega$ be simply connected. Let $\partial\Omega$ be the boundary of the domain $\Omega$, which splits into two parts: $\partial\Omega = \overline{\Gamma}_E \cup \overline{\Gamma}_J$. The surface $\Gamma_E$ corresponds to the tip of the electrode where the electric arc arises. In its turn, the rest of the electrode boundary splits as follows:

\[ \overline{\Gamma}_J = \overline{\Gamma}_J^0 \cup \overline{\Gamma}_J^1 \cup \cdots \cup \overline{\Gamma}_J^N, \]

where $\Gamma_J^n$, $n = 1, \ldots, N$, are the parts of the boundary connected to the wires supplying electric current to the electrode, and $\Gamma_J^0 = \Gamma_J \setminus (\Gamma_J^1 \cup \cdots \cup \Gamma_J^N)$ is the remaining part (see Figure 6). We also assume $\overline{\Gamma}_J^n \cap \overline{\Gamma}_E = \emptyset$ and $\overline{\Gamma}_J^n \cap \overline{\Gamma}_J^m = \emptyset$, $m, n = 1, \ldots, N$, $m \neq n$.

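Our goal is to solve the eddy current equations (1)-(5) subject to the following boundary conditions:

\begin{align}
\mathbf{E} \times \mathbf{n} &= \mathbf{0} && \text{on } \Gamma_E, \tag{10} \\
\int_{\Gamma_J^n} \mathrm{curl}\,\mathbf{H} \cdot \mathbf{n} &= I_n, && n = 1, \ldots, N, \tag{11} \\
\mathbf{E} \times \mathbf{n} &= \mathbf{0} && \text{on } \Gamma_J^n, \ n = 1, \ldots, N, \tag{12} \\
\mathrm{curl}\,\mathbf{H} \cdot \mathbf{n} &= 0 && \text{on } \Gamma_J^0, \tag{13} \\
\mu \mathbf{H} \cdot \mathbf{n} &= 0 && \text{on } \partial\Omega, \tag{14}
\end{align}

where the only data $I_n$, $n = 1, \ldots, N$, are the current intensities through each wire.

Fig. 6. Example of domain $\Omega$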



Condition (10) is the natural one to model the free current exit on the electrode tip. Conditions (11) and (13) take into account the input intensities and the fact that there is no current flow through $\Gamma_J^0$, respectively. Conditions (12) and (14) have been proposed by Bossavit [12] in a more general setting. They will appear as natural boundary conditions of the weak formulation of our problem. The former implies the assumption that the electric current is normal to the surface on the current entrance, whereas the latter means that the magnetic field is tangential to the conductor surface. Of course, condition (14) is not always fulfilled, but it is a good approximation in our model problem.

Next, we analyze a weak formulation of this problem in terms of the mag- 
netic field and propose a finite element method for its numerical solution. 

3.2 Analysis of the weak formulation of the problem 

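To obtain a weak formulation of the eddy current problem (1)-(5) with boundary conditions (10)-(14) in terms of the magnetic field, we notice that the boundary condition (14) implies that the tangential component of $\mathbf{E}$ on the boundary of $\Omega$ is a gradient. In particular, we obtain that $\mathbf{E} \times \mathbf{n} = -\mathrm{grad}\,\phi \times \mathbf{n}$ on $\partial\Omega$ for some scalar function $\phi$, with $\phi = 0$ on $\Gamma_E$ because of (10). Moreover, because of (12), $\phi|_{\Gamma_J^n}$ must be constant. Then, multiplying equation (2) by a test function $\mathbf{G}$ such that $\mathrm{curl}\,\mathbf{G} \cdot \mathbf{n} = 0$ on $\Gamma_J^0$ and $\int_{\Gamma_J^n} \mathrm{curl}\,\mathbf{G} \cdot \mathbf{n} = 0$, $n = 1, \ldots, N$, using Green's formula, and taking into account that $\mathbf{E} = \frac{1}{\sigma}\,\mathrm{curl}\,\mathbf{H}$, we obtain

\[ i\omega \int_{\Omega} \mu \mathbf{H} \cdot \bar{\mathbf{G}} + \int_{\Omega} \frac{1}{\sigma}\, \mathrm{curl}\,\mathbf{H} \cdot \mathrm{curl}\,\bar{\mathbf{G}} = 0. \]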

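Let $\mathcal{X} := H(\mathrm{curl}, \Omega)$ and $a : \mathcal{X} \times \mathcal{X} \to \mathbb{C}$ be the sesquilinear continuous and elliptic form defined by

\[ a(\mathbf{H}, \mathbf{G}) := i\omega \int_{\Omega} \mu \mathbf{H} \cdot \bar{\mathbf{G}} + \int_{\Omega} \frac{1}{\sigma}\, \mathrm{curl}\,\mathbf{H} \cdot \mathrm{curl}\,\bar{\mathbf{G}}. \]

Let $\mathcal{L}$ be the following closed subspace of $H_{00}^{1/2}(\Gamma_J)$:

\[ \mathcal{L} := \bigl\{ \nu \in H_{00}^{1/2}(\Gamma_J) : \nu|_{\Gamma_J^n} = \text{constant}, \ n = 1, \ldots, N \bigr\}. \]

Given $\mathbf{I} = (I_1, \ldots, I_N) \in \mathbb{C}^N$, let us consider the closed linear manifold of $\mathcal{X}$

\[ W(\mathbf{I}) := \Bigl\{ \mathbf{G} \in \mathcal{X} : \langle \mathrm{curl}\,\mathbf{G} \cdot \mathbf{n}, \nu \rangle_{\Gamma_J} = \sum_{n=1}^{N} I_n\, \bar{\nu}|_{\Gamma_J^n} \quad \forall \nu \in \mathcal{L} \Bigr\} \]

and its associated subspace

\[ W(\mathbf{0}) := \bigl\{ \mathbf{G} \in \mathcal{X} : \langle \mathrm{curl}\,\mathbf{G} \cdot \mathbf{n}, \nu \rangle_{\Gamma_J} = 0 \quad \forall \nu \in \mathcal{L} \bigr\}. \]

We introduce the following problem:

Problem PI.- For any $\mathbf{I} \in \mathbb{C}^N$, find $\mathbf{H} \in W(\mathbf{I})$ such that

\[ a(\mathbf{H}, \mathbf{G}) = 0 \quad \forall \mathbf{G} \in W(\mathbf{0}). \]

Theorem 5. Given $\mathbf{I} \in \mathbb{C}^N$, problem PI has a unique solution $\mathbf{H}$.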

To avoid dealing with functions that satisfy the constraints involved in $W(\mathbf{I})$ and $W(\mathbf{0})$, we consider a mixed formulation of the problem. It consists in handling the boundary conditions (11) and (13) in a weak sense by introducing a Lagrange multiplier defined on $\Gamma_J$.

Let $b$ be the sesquilinear form defined in $\mathcal{X} \times \mathcal{L}$ by

\[ b(\mathbf{G}, \nu) := \langle \mathrm{curl}\,\mathbf{G} \cdot \mathbf{n}, \nu \rangle_{\Gamma_J}. \]

The mixed problem associated with problem PI is the following:
Problem MPI.- Given $\mathbf{I} \in \mathbb{C}^N$, find $\mathbf{H} \in \mathcal{X}$ and $\lambda \in \mathcal{L}$ such that

\[ a(\mathbf{H}, \mathbf{G}) + \overline{b(\mathbf{G}, \lambda)} = 0 \quad \forall \mathbf{G} \in \mathcal{X}, \]
\[ b(\mathbf{H}, \nu) = \sum_{n=1}^{N} I_n\, \bar{\nu}|_{\Gamma_J^n} \quad \forall \nu \in \mathcal{L}. \]

Theorem 6. Given $\mathbf{I} \in \mathbb{C}^N$, let $\mathbf{H} \in \mathcal{X}$ be the solution of problem PI. Then, there exists a unique $\lambda \in \mathcal{L}$ such that $(\mathbf{H}, \lambda)$ is the only solution of problem MPI. Furthermore, the following estimate holds:

The proof is based on the classical Babuška-Brezzi theory. In particular, we prove the inf-sup condition for the sesquilinear form $b$ by using results concerning vector potentials (see [9]).

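Theorem 3.5 in [9] shows that the solution of problem MPI, together with $\mathbf{E} = \frac{1}{\sigma}\,\mathrm{curl}\,\mathbf{H}$ and $\mathbf{J} = \mathrm{curl}\,\mathbf{H}$, satisfies the Maxwell equations (1)-(5) and the boundary conditions (10)-(14) in a suitable weak sense. Moreover, from that theorem, we also have that the Lagrange multiplier is an electric surface potential on $\Gamma_J$, namely,

\[ \mathbf{n} \times (\mathbf{E} \times \mathbf{n}) = -\mathbf{n} \times (\nabla\lambda^* \times \mathbf{n}) =: -\mathrm{grad}_{\Gamma}\,\lambda^* \quad \text{on } \Gamma_J, \]

$\lambda^*$ being a lifting of $\lambda$ to $\Omega$ such that $\lambda^* \in H^1(\Omega)$ and $\lambda^*|_{\Gamma_E} = 0$.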

3.3 Finite element discretization 

In this section we introduce a discretization of the mixed problem MPI and study its convergence properties. To this end, we assume that $\Omega$ is a Lipschitz polyhedron and that $\Gamma_J^n$ are polyhedral surfaces for all $n = 0, \ldots, N$. Consequently, $\Gamma_E$ is also a polyhedral surface. We also assume that $\sigma$ is piecewise smooth (e.g., on a polyhedral partition of $\Omega$).

We consider a family of shape-regular tetrahedral meshes $\{\mathcal{T}_h\}$ of $\Omega$. We assume that the meshes are compatible with the splitting of the boundary of the domain and with $\sigma$, in the sense that, $\forall K \in \mathcal{T}_h$ with a face $T$ lying on $\partial\Omega$,

- either $T \subset \overline{\Gamma}_E$ or $T \subset \overline{\Gamma}_J^n$ for some $n = 0, \ldots, N$;

- $\sigma|_K$ is smooth.

The magnetic field, which is a function of $\mathcal{X} = H(\mathrm{curl}, \Omega)$, is discretized by the lowest-order Nédélec edge finite elements described in Section 2, i.e. we define $\mathcal{X}_h := \mathcal{N}_h(\Omega)$ as an approximation of $\mathcal{X}$.

Let $\mathcal{T}_h^{\Gamma_J}$ be the triangular mesh induced by $\mathcal{T}_h$ on the polyhedral surface $\Gamma_J$ and consider the following finite-dimensional space:

\[ Q_h(\Gamma_J) := \bigl\{ q_h \in H_{00}^{1/2}(\Gamma_J) : q_h|_T \in \mathcal{P}_1(T) \ \forall T \in \mathcal{T}_h^{\Gamma_J} \bigr\}. \]

The Lagrange multiplier will be approximated in the finite-dimensional space

\[ \mathcal{L}_h := \bigl\{ \nu_h \in Q_h(\Gamma_J) : \nu_h|_{\Gamma_J^n} = \text{constant}, \ n = 1, \ldots, N \bigr\}. \]

We define the following discrete problem:

Problem DMPI.- Given $\mathbf{I} \in \mathbb{C}^N$, find $\mathbf{H}_h \in \mathcal{X}_h$ and $\lambda_h \in \mathcal{L}_h$ such that

\[ a(\mathbf{H}_h, \mathbf{G}_h) + \overline{b(\mathbf{G}_h, \lambda_h)} = 0 \quad \forall \mathbf{G}_h \in \mathcal{X}_h, \]
\[ b(\mathbf{H}_h, \nu_h) = \sum_{n=1}^{N} I_n\, \bar{\nu}_h|_{\Gamma_J^n} \quad \forall \nu_h \in \mathcal{L}_h. \]

Theorem 7. Given $\mathbf{I} \in \mathbb{C}^N$, problem DMPI has a unique solution $(\mathbf{H}_h, \lambda_h)$. Furthermore, if the solution $(\mathbf{H}, \lambda)$ of problem MPI satisfies $\mathbf{H} \in H^r(\mathrm{curl}, \Omega)$ with $1/2 < r < 1$, then the following error estimate holds true:

Summary. This paper uses the general framework of space decomposition - subspace correction to provide an overview of Schwarz-type preconditioners. The considered preconditioners are one-level and two-level Schwarz methods based on an overlapping domain decomposition, a two-level method with the coarse grid space created by aggregation, and a new two-level method with interfaces in the coarse grid space. Besides the description and analysis, we also discuss some implementation details, the use of inexact subproblem solvers, etc. The efficiency of the preconditioners is illustrated by numerical examples.



1 Introduction 

This paper concerns the numerical solution of large scale linear systems arising from the finite element (FE) solution of boundary value problems. Particular attention is paid to numerical methods that are suitable for implementation on high performance parallel computers.

This motivates the interest in space decomposition (SD) preconditioners 
described in Section 2. These preconditioners provide a general framework, 
which has many specific applications. This paper concentrates on Schwarz-type 
preconditioners, which are usually based on the overlapping domain decompo- 
sition (DD). This class itself involves many variants of the basic technique and 
some of them will be reported in this paper. There are also many other use- 
ful decompositions, which can be used for the construction of preconditioners. 
Let us mention two examples: the hierarchical decomposition of FE spaces (cf. 
[13, 20, 10]) or displacement decomposition for elasticity problems (cf. e.g. [4] 
and the references therein). 

An overview of the standard one-level and two-level overlapping DD methods can be found in Sections 3 and 4. These methods are also called Schwarz methods after the pioneering work of H.A. Schwarz from 1870, cf. [19]. The real and rapid development of these methods, however, motivated by the interest in parallel computing, started in the second half of the 1980s.

In Section 5, we describe a less standard two-level Schwarz method with the auxiliary global problem created by aggregation of unknowns. This method is useful in many cases when application of the standard two-level methods requires a lot of extra work involved in creating an auxiliary coarse grid and relating this grid to the original one.

Space Decomposition Preconditioners and Parallel Solvers

Section 6 is devoted to the description of a new two-level method, which 
uses non-overlapping DD and interaction of subdomain problems only via 
a special coarse grid space with interfaces. 

Some aspects of implementation of the preconditioners on parallel comput- 
ers, the use of inexact subproblem solvers and use of nonsymmetric multiplica- 
tive or hybrid preconditioners are discussed in Section 7. The precondition- 
ers, which are not symmetric positive definite, can be efficiently implemented 
within the generalized preconditioned conjugate gradient method (GPCG), cf. 
[4] and the references therein. In Section 8, we provide some numerical results 
illustrating the efficiency of the described methods and corresponding solvers. 

In the final section, we summarize the results and mention some topics not 
covered in the paper. 



2 Abstract SD preconditioners 

Our aim is the solution of an abstract discrete symmetric elliptic problem of the following form:

\[ \text{find } u \in V : \quad a(u, v) = l(v) \quad \forall v \in V, \tag{1} \]

where $V$ is a finite-dimensional subspace of a Hilbert space $\mathcal{V} = \mathcal{V}(\Omega)$ of functions defined in a domain $\Omega \subset \mathbb{R}^d$ ($d = 2, 3$), $a$ is a bounded symmetric positive definite (SPD) bilinear form on $\mathcal{V}$ and $l$ is a bounded linear functional on $\mathcal{V}$.

To be more specific, we shall consider elliptic boundary value problems with the bilinear form

\[ a(u, v) = \int_{\Omega} \sum_{i,j=1}^{d} k_{ij} \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} \, dx \tag{2} \]

defined in the Sobolev space $H^1(\Omega)$ equipped with the seminorm $|\cdot|_{H^1(\Omega)}$ and the norm $\|\cdot\|_{H^1(\Omega)}$. We assume that $\mathcal{K} = (k_{ij})$ is a symmetric positive definite $d \times d$ matrix, which guarantees the existence of positive constants $\kappa_1$, $\kappa_2$ such that

\[ \kappa_1 |v|^2_{H^1(\Omega)} \le a(v, v) \le \kappa_2 |v|^2_{H^1(\Omega)} \quad \forall v \in H^1(\Omega). \tag{3} \]

Sometimes, we shall assume that $V \subset H^1(\Omega)$ is such that

\[ \kappa_0 \|v\|^2_{H^1(\Omega)} \le a(v, v) \quad \forall v \in V. \tag{4} \]

Let us note that most of the presented results are valid also for other elliptic problems, e.g. problems of linear elasticity.

Using an inner product $(\cdot, \cdot)$ in $V$, problem (1) can be rewritten in the following operator form:

\[ \text{find } u \in V : \quad Au = b, \tag{5} \]

where $A : V \to V$ and $b \in V$ are determined by the identities

\[ (Au, v) = a(u, v), \qquad (b, v) = l(v) \qquad \forall u, v \in V. \]

Further, we shall assume that there is a decomposition of the space $V$,

\[ V = V_1 + \ldots + V_m, \tag{6} \]

where $V_k$ ($k = 1, \ldots, m$) are subspaces of $V$, which are not necessarily linearly independent. For each subspace $V_k$, we introduce

- the operator $A_k : V_k \to V_k$ defined by $(A_k u, v) = a(u, v)$ $\forall u, v \in V_k$,

- the prolongation operator $I_k : V_k \to V$ given by the inclusion $V_k \subset V$,

- the restriction operator $R_k : V \to V_k$ given by the orthogonal projection to $V_k$, i.e. $\forall v \in V$: $R_k v \in V_k$ and $(v - R_k v, w) = 0$ $\forall w \in V_k$.

Note that $I_k$ and $R_k$ are adjoint with respect to $(\cdot, \cdot)$ and $A_k = R_k A I_k$.

The decomposition (6) allows us to introduce space decomposition (SD) preconditioners. The SD preconditioner is an operator $G$, which can be used for a cheap computation of the pseudoresidual $g = Gr$ from a given residual $r$. The pseudoresidual should provide an approximation of the error $A^{-1} r$ or at least its direction. The computation of pseudoresiduals $g$ is realized by the following SD algorithm:

    g = 0
    for k = 1, ..., m do
        g <- g + I_k A_k^{-1} R_k z_k
    end



This SD algorithm represents several types of preconditioners:

1. The additive preconditioner $G_A : r \mapsto g$ arises if $z_k = r$. In this case,

\[ G_A = \sum_{k=1}^{m} B_k, \qquad B_k = I_k A_k^{-1} R_k, \tag{7} \]

is symmetric and positive definite with respect to $(\cdot, \cdot)$. Moreover,

\[ G_A A = \sum_{k=1}^{m} P_k, \qquad P_k = I_k A_k^{-1} R_k A, \tag{8} \]

where $P_k$ are projections $V \to V_k$, which are orthogonal with respect to the inner product induced by $A$, $(u, v)_A = (Au, v)$.

2. The multiplicative preconditioner $G_M : r \mapsto g$ arises if $z_k = r - Ag$. In this case, it is easy to verify by induction that

\[ G_M = [I - (I - P_m) \cdots (I - P_1)] A^{-1} = A^{-1} [I - (I - A B_m) \cdots (I - A B_1)]. \tag{9} \]

This preconditioner is not symmetric. To obtain a symmetric multiplicative operator, it is possible to continue the subspace corrections in the reverse order. This gives

\[ G_{SM} = [I - (I - P_1) \cdots (I - P_m) \cdots (I - P_1)] A^{-1}. \tag{10} \]

For this preconditioner, it is easy to show that $G_{SM} A$ is symmetric in $(\cdot, \cdot)_A$, which implies that $G_{SM}$ is symmetric in $(\cdot, \cdot)$. Similarly, it is easy to show that $G_{SM} A$ and $G_{SM}$ are positive semidefinite. The positive definiteness of $G_{SM}$ is equivalent to the fact that $I - G_M A$ is convergent (see Theorem 2).

3. The hybrid preconditioner $G_H : r \mapsto g$ arises if some residuals are updated and others are kept. For example, we can update only the residual after the first subspace correction, which gives

\[ G_H = B_1 + \sum_{k=2}^{m} B_k (I - A B_1) = \Bigl[ P_1 + \sum_{k=2}^{m} P_k (I - P_1) \Bigr] A^{-1}. \tag{11} \]

This preconditioner can again be symmetrized to the form

\[ G_{SH} = \Bigl[ P_1 + (I - P_1) \sum_{k=2}^{m} P_k (I - P_1) \Bigr] A^{-1}. \tag{12} \]

It shows that the preconditioned operator $G_{SH} A$ is decomposed,

\[ G_{SH} A = P_1 + (I - P_1) \sum_{k=2}^{m} P_k (I - P_1), \tag{13} \]

i.e. $G_{SH} A$ is equal to the identity on the range $R(P_1)$ and to the additively preconditioned operator $G_A A$ on the range $R(I - P_1)$. Thus, we can expect that the hybrid preconditioner will be more efficient than the additive one, cf. also [17].
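The SD algorithm and the formulas (8) and (9) can be tried out on a small matrix example. The sketch below is only an illustration under several assumptions that are not part of the paper: the inner product is Euclidean, the subspaces $V_k$ are spanned by coordinate vectors (so that $I_k$ is a Boolean matrix and $R_k = I_k^T$), and $A$ is a small 1D Laplacian stencil. All function names are hypothetical.

```python
import numpy as np

def make_subspaces(n, index_sets):
    """Boolean prolongation matrices I_k (n x n_k) for coordinate subspaces."""
    prolongations = []
    for idx in index_sets:
        I_k = np.zeros((n, len(idx)))
        I_k[idx, np.arange(len(idx))] = 1.0
        prolongations.append(I_k)
    return prolongations

def sd_apply(A, prolongations, r, variant="additive"):
    """Pseudoresidual g = G r computed by the SD algorithm."""
    g = np.zeros_like(r, dtype=float)
    for I_k in prolongations:
        R_k = I_k.T                      # restriction = adjoint of prolongation
        A_k = R_k @ A @ I_k              # A_k = R_k A I_k
        z_k = r if variant == "additive" else r - A @ g
        g = g + I_k @ np.linalg.solve(A_k, R_k @ z_k)
    return g

def sd_matrix(A, prolongations, variant):
    """Assemble G column by column by applying the algorithm to unit vectors."""
    n = A.shape[0]
    return np.column_stack([sd_apply(A, prolongations, e, variant)
                            for e in np.eye(n)])

n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stencil
index_sets = [np.arange(0, 5), np.arange(3, 9), np.arange(7, 12)]
subspaces = make_subspaces(n, index_sets)

G_A = sd_matrix(A, subspaces, "additive")
G_M = sd_matrix(A, subspaces, "multiplicative")

# Projections P_k = I_k A_k^{-1} R_k A and the product appearing in (9).
P = [I_k @ np.linalg.solve(I_k.T @ A @ I_k, I_k.T @ A) for I_k in subspaces]
error_propagation = np.eye(n)
for P_k in P:                            # builds (I - P_m) ... (I - P_1)
    error_propagation = (np.eye(n) - P_k) @ error_propagation
```

With these definitions, `G_A @ A` coincides with the sum of the projections in `P`, and `G_M @ A` coincides with `np.eye(n) - error_propagation`, in agreement with (8) and (9). Forming dense matrices is done here only for checking; in practice the preconditioner is applied through `sd_apply`.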

Further in this paper, we shall exploit the following two assumptions, which characterize the considered decomposition:

(A1) There exists a constant $K_0$ such that

\[ \forall v \in V \ \exists v_k \in V_k : \quad v = v_1 + \ldots + v_m, \quad \sum_k \|v_k\|_A^2 \le K_0 \|v\|_A^2. \tag{14} \]

(A2) There exists a constant $K_1$ such that

\[ \forall v \in V \ \forall v_k \in V_k : \ v = v_1 + \ldots + v_m : \quad \|v\|_A^2 \le K_1 \sum_k \|v_k\|_A^2. \tag{15} \]

Note that $\|u\|_A = \sqrt{(u, u)_A}$. A trivial upper bound is $K_1 = m$. But we are interested in $m$-independent bounds for $K_1$, which can be found e.g. by considering the angles between the subspaces $V_k$. If

\[ \varepsilon_{kl} = \cos(V_k, V_l) = \sup \{ (v_k, v_l)_A : v_k \in V_k, \ v_l \in V_l, \ \|v_k\|_A = 1, \ \|v_l\|_A = 1 \}, \]

$\mathcal{E} = (\varepsilon_{kl})$ and $\rho(\mathcal{E})$ is the spectral radius of $\mathcal{E}$, then

\[ K_1 \le \rho(\mathcal{E}) \le \max_k \sum_l \varepsilon_{kl}. \tag{16} \]

Sometimes, one of the subspaces, say $V_1$, plays an exceptional role. Then it may be useful to consider $\mathcal{E}_1 = (\varepsilon_{kl}, \ k, l \neq 1)$ and the estimate

\[ K_1 \le 2 (1 + \rho(\mathcal{E}_1)) \le 2 + 2 \max_{k \neq 1} \sum_{l \neq 1} \varepsilon_{kl}. \tag{17} \]
k^l ^ ' 

Theorem 1. Let $\lambda_{\min}(G_A A)$, $\lambda_{\max}(G_A A)$ and $\mathrm{cond}(G_A A)$ denote respectively the smallest and largest eigenvalue and the condition number of the additively preconditioned operator $G_A A$. Then

\[ \lambda_{\min}(G_A A) \ge 1/K_0, \qquad \lambda_{\max}(G_A A) \le K_1, \qquad \mathrm{cond}(G_A A) \le K_0 K_1. \tag{18} \]

Remark 1. The estimate of $\lambda_{\min}(G_A A)$ is well known as Lions' lemma, see [16]. The proof of Theorem 1 as well as some historical remarks can be found e.g. in the books [13, 20, 18].

Theorem 2. For the multiplicative preconditioner, we get 

||/ - GsmAWa = 11/ - GmA\W < [l - ' , (19) 

\[ \mathrm{cond}(G_{SM} A) \le \bigl[ 1 - \|I - G_{SM} A\|_A \bigr]^{-1}. \tag{20} \]

Remark 2. The proof of the estimate (19) is more technical, see [7]. The com- 
plete proof can be found e.g. in [20, 18]. The estimate (20) easily follows from 
(19). 
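
The step from (19) to (20) can be spelled out as follows. This is a sketch rather than the argument of the cited references; the symbols $E$ (the error propagation operator of the multiplicative method), $E^{*}$ (its adjoint in $(\cdot,\cdot)_A$) and $\varrho$ are introduced here only for illustration, and $\sigma(\cdot)$ denotes the spectrum. It uses that each $P_k$ is an orthogonal projection in $(\cdot,\cdot)_A$, so that $(I - P_m)^2 = I - P_m$.

```latex
\begin{align*}
E &= I - G_M A = (I - P_m)\cdots(I - P_1), \qquad
I - G_{SM}A = E^{*}E,\\
\varrho &:= \|I - G_{SM}A\|_A = \|E\|_A^{2},\\
\sigma(I - G_{SM}A) \subset [0,\varrho]
  &\;\Longrightarrow\; \sigma(G_{SM}A) \subset [1-\varrho,\,1],\\
\operatorname{cond}(G_{SM}A) &\le \frac{1}{1-\varrho}
  = \bigl[\,1 - \|I - G_{SM}A\|_A\,\bigr]^{-1} \qquad (\varrho < 1).
\end{align*}
```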





3 Schwarz preconditioners based on overlapping DD 

A suitable space decomposition (6) can be constructed via an overlapping decomposition of the computational domain $\Omega$. Now, we describe and analyse a typical model situation. We solve the boundary value problem (1) in $\Omega \subset \mathbb{R}^d$ by the finite element method with linear triangular or tetrahedral elements. Let $\mathcal{T}_h$ be a regular finite element division of $\Omega$, $h = \max\{\mathrm{diam}(e), \ e \in \mathcal{T}_h\}$.

Moreover, let us assume that there is a division of $\Omega$ into $m$ non-overlapping subdomains $\Omega_1, \ldots, \Omega_m$. Let $\Omega_k^{\delta}$ be subdomains of $\Omega$ aligned with the division $\mathcal{T}_h$ and such that $\Omega_k^{\delta} \supset \{x \in \Omega : \mathrm{dist}(x, \Omega_k) < \delta/2\}$. Let us also denote $H_S = \max\{\mathrm{diam}(\Omega_k^{\delta}), \ k = 1, \ldots, m\}$ and assume that each point of $\Omega$ belongs to at most $m_c$ subdomains $\Omega_k^{\delta}$, see Fig. 1.

Fig. 1. Decomposition of $\Omega$, subdomains $\Omega_k^{\delta}$, fine triangulation $\mathcal{T}_h$, $m_c = 4$

The overlapping domain decomposition

\[ \Omega = \Omega_1^{\delta} \cup \ldots \cup \Omega_m^{\delta} \tag{21} \]

now induces a decomposition (6) of the finite element space $V = V_h$ into the subspaces $V_1, \ldots, V_m$,

\[ V_k = \{ v \in V : v = 0 \ \text{in } \Omega \setminus \Omega_k^{\delta} \}. \tag{22} \]

Moreover, we get the following characterization of this decomposition. 

Theorem 3. Under the above assumptions, the decomposition $V = V_1 + \ldots + V_m$ with the local FE spaces (22) fulfils the assumptions (A1), (A2) with the constants

\[ K_0 = C (1 + \delta^{-2}), \qquad K_1 = m_c. \tag{23} \]

Remark 3. The proof of Theorem 3 can be found in the works of M. Dryja and O. Widlund, see the books [13, 20] and the references given there. We sketch the proof here, because it contains ideas important for understanding the further development of the method in the next sections.

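Sketch of the proof.

- For the overlapping domain decomposition, there is a partition of unity $\{\theta_k\}$ such that

\[ \theta_k = 0 \ \text{in } \mathbb{R}^d \setminus \Omega_k^{\delta}, \qquad \sum_{k=1}^{m} \theta_k = 1, \qquad \|\mathrm{grad}\,\theta_k\|_{L^\infty(\Omega)} = O(\delta^{-1}). \]

- Using this partition of unity and the piecewise linear interpolation $\Pi_h : C(\overline{\Omega}) \to V_h$, we can decompose any element $v \in V_h$ as follows,

\[ v = \sum_k v_k, \qquad v_k = \Pi_h(\theta_k v). \]

- Property (A1):

\[ \sum_k \|v_k\|_A^2 \le \kappa_2 \sum_k |v_k|^2_{H^1(\Omega)} = \kappa_2 \sum_k \sum_{T \subset \Omega_k^{\delta}, \ T \in \mathcal{T}_h} |\Pi_h(\theta_k v)|^2_{H^1(T)}. \]

On the element level, it holds that

\[ |\Pi_h(\theta_k v)|^2_{H^1(T)} \le \mu \bigl[ \delta^{-2} \|v\|^2_{L^2(T)} + |v|^2_{H^1(T)} \bigr], \]

see e.g. [10], Lemma 7.4.17 or [12]. Moreover, $T \in \mathcal{T}_h$ can belong to at most $m_c$ subdomains $\Omega_k^{\delta}$. Thus,

\begin{align*}
\sum_k \|v_k\|_A^2 &\le \kappa_2\, \mu\, m_c \sum_{T \in \mathcal{T}_h} \bigl[ \delta^{-2} \|v\|^2_{L^2(T)} + |v|^2_{H^1(T)} \bigr] \\
&\le \kappa_2\, \mu\, m_c\, (\delta^{-2} + 1)\, \|v\|^2_{H^1(\Omega)} \\
&\le \kappa_2\, \mu\, m_c\, \kappa_0^{-1} (\delta^{-2} + 1)\, \|v\|_A^2.
\end{align*}

Note that $\kappa_0$, $\kappa_2$ are the constants from (3), (4).

- Property (A2):

\begin{align*}
\Bigl\| \sum_k v_k \Bigr\|_A^2 &= \sum_{i,j} a(v_i, v_j) = \sum_{i,j} \sum_{T \in \mathcal{T}_h} a_T(v_i, v_j) = \sum_{T \in \mathcal{T}_h} \sum_{i,j} a_T(v_i, v_j) \\
&\le \sum_{T \in \mathcal{T}_h} \sum_{i,j} \sqrt{a_T(v_i, v_i)}\, \sqrt{a_T(v_j, v_j)} \\
&= \sum_{T \in \mathcal{T}_h} \Bigl[ \sum_i \sqrt{a_T(v_i, v_i)} \Bigr] \Bigl[ \sum_j \sqrt{a_T(v_j, v_j)} \Bigr],
\end{align*}

where $a_T(v_i, v_j)$ denotes the restriction of the bilinear form $a$ to $T$. Each $T \in \mathcal{T}_h$ belongs to at most $m_c$ subdomains and therefore $\sum_i \sqrt{a_T(v_i, v_i)}$ has at most $m_c$ nonzero terms. The Cauchy-Bunyakovsky-Schwarz (C.B.S.) inequality thus gives

\[ \sum_i \sqrt{a_T(v_i, v_i)} \le \sqrt{m_c}\, \sqrt{\sum_i a_T(v_i, v_i)}. \]

Hence,

\[ \Bigl\| \sum_k v_k \Bigr\|_A^2 \le \sum_{T \in \mathcal{T}_h} m_c \sum_k a_T(v_k, v_k) = m_c \sum_k \|v_k\|_A^2. \]

□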

Normally, the overlap $\delta$ is kept proportional to the size of the subdomains $H_S$, i.e. $\delta = \beta H_S$, where $\beta$ is the proportionality constant. In this case, a combination of Theorems 1, 2, 3 says that $\mathrm{cond}(G_A A)$ and $\mathrm{cond}(G_{SM} A)$ deteriorate with the increasing number of subdomains ($H_S \to 0$, $\delta \to 0$). A remedy can be found in adding an auxiliary coarse grid FE space $V_0$ to the space decomposition with the local FE spaces (22), see the next section.
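
The deterioration of the one-level method can be observed on a small model problem. The sketch below is an assumption-laden illustration, not the experiments reported later in the paper: it uses the 1D Laplacian stencil on 127 interior nodes, subdomains given by overlapping index blocks, and an overlap roughly proportional to the block size. The function names are hypothetical.

```python
import numpy as np

def laplacian_1d(n):
    """Tridiagonal 1D Laplacian stencil on n interior nodes."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def overlapping_index_sets(n, m, overlap):
    """Split 0..n-1 into m blocks and extend each by `overlap` nodes per side."""
    bounds = np.linspace(0, n, m + 1).astype(int)
    return [np.arange(max(bounds[k] - overlap, 0),
                      min(bounds[k + 1] + overlap, n)) for k in range(m)]

def additive_schwarz_cond(A, index_sets):
    """Condition number of G_A A for coordinate subspaces V_k."""
    n = A.shape[0]
    G = np.zeros((n, n))
    for idx in index_sets:
        block = np.ix_(idx, idx)
        G[block] += np.linalg.inv(A[block])      # adds B_k = I_k A_k^{-1} R_k
    # G_A A is similar to the symmetric matrix L^T G_A L, where A = L L^T.
    L = np.linalg.cholesky(A)
    eigenvalues = np.linalg.eigvalsh(L.T @ G @ L)
    return eigenvalues[-1] / eigenvalues[0]

n = 127
A = laplacian_1d(n)
conds = {}
for m in (2, 4, 8, 16):
    overlap = max(n // (8 * m), 1)               # overlap roughly proportional to H_S
    conds[m] = additive_schwarz_cond(A, overlapping_index_sets(n, m, overlap))
```

The dictionary `conds` maps the number of subdomains to the computed condition number of $G_A A$; in this setting the value for 16 subdomains is markedly larger than for 2.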



4 Two-level Schwarz preconditioners

Let us consider the extended decomposition

\[ V = V_0 + V_1 + \ldots + V_m, \tag{24} \]

where $V = V_h$ is the FE space corresponding to the FE division $\mathcal{T}_h$, $V_1, \ldots, V_m$ are the local FE spaces (22) and $V_0 = V_H$ is the FE space corresponding to a coarser FE division $\mathcal{T}_H$ of the domain $\Omega$. We assume that $\mathcal{T}_h$ is a refinement of $\mathcal{T}_H$, which guarantees that $V_H \subset V_h$. Moreover, we assume that $H \le H_S$.

In a more general case of non-nested grids, we could use a prolongation (interpolation) $I_H : V_H \to V_h$ ($I_H = \Pi_h$) and choose $V_0 = I_H V_H$, see [12]. The subproblem operator $A_0$ corresponding to $V_0$ is determined by restriction of the variational formulation to $V_0$. There is also the relation $A_0 = I_H^T A I_H$.

Theorem 4. The decomposition (24) fulfils the assumptions (A1), (A2) with the constants

\[ K_0 = C (1 + \delta^{-2} H^2), \qquad K_1 = 2 (1 + m_c). \tag{25} \]

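Sketch of the proof.

- For the analysis, we shall use the fact that there is a mapping $Q : V_h \to V_0$ and constants $\sigma_1$, $\sigma_2$ such that

\[ |Qv|_{H^1(\Omega)} \le \sigma_1 |v|_{H^1(\Omega)}, \tag{26} \]
\[ \|v - Qv\|_{L^2(\Omega)} \le \sigma_2 H |v|_{H^1(\Omega)}. \tag{27} \]

The above properties can be proved e.g. if $Q$ is the $L_2$-orthogonal projection into $V_0$, see [8].

- For any $v \in V_h$, we take the decomposition $v = v_0 + v_1 + \ldots + v_m$, $v_0 = Qv$, $v_k = \Pi_h(\theta_k (v - v_0))$ for $k = 1, \ldots, m$.

- For this decomposition,

\[ \|v_0\|_A^2 \le \kappa_2 |v_0|^2_{H^1(\Omega)} \le \kappa_2 \sigma_1^2 |v|^2_{H^1(\Omega)} \le \kappa_2 \sigma_1^2 \kappa_1^{-1} \|v\|_A^2. \]

Further,

\begin{align*}
\sum_{k=1}^{m} \|v_k\|_A^2 &\le \kappa_2 \sum_{k=1}^{m} |v_k|^2_{H^1(\Omega)} = \kappa_2 \sum_{k=1}^{m} \sum_{T \subset \Omega_k^{\delta}, \ T \in \mathcal{T}_h} |v_k|^2_{H^1(T)} \\
&\le \kappa_2\, \mu\, m_c \bigl[ \delta^{-2} \|v - v_0\|^2_{L^2(\Omega)} + |v - v_0|^2_{H^1(\Omega)} \bigr] \\
&\le \kappa_2\, \mu\, m_c \bigl[ \sigma_2^2 \delta^{-2} H^2 |v|^2_{H^1(\Omega)} + 2 |v|^2_{H^1(\Omega)} + 2 |v_0|^2_{H^1(\Omega)} \bigr] \\
&\le \kappa_2\, \mu\, m_c \bigl[ 2 + 2 \sigma_1^2 + \sigma_2^2 \delta^{-2} H^2 \bigr] |v|^2_{H^1(\Omega)} \\
&\le C \bigl[ 1 + \delta^{-2} H^2 \bigr] \|v\|_A^2.
\end{align*}

- The above estimates show that $K_0 = C (1 + \delta^{-2} H^2)$.

- The estimate of $K_1$ follows easily from (17) and the estimate (23).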



Remark 4. The proof of Theorem 4 can again be found in the works of M. Dryja and O. Widlund, see also [13, 20, 10]. The estimates (25) can be strengthened. In [14], it is proved that $K_0 = C (1 + \delta^{-1} H)$. This estimate is sharp, cf. [9].

Remark 5. The constant $C$ depends on $\kappa_1$, $\kappa_2$, which indicates some dependence on the anisotropy and jumps in the coefficients of the bilinear form, cf. [2]. In contrast to Theorem 3, the assumption (4) was not used here.

If $\mathcal{T}_h$ arises as a refinement of a coarser FE division $\mathcal{T}_H$, then the use of $V_0 = V_H$ is fully natural. In other cases, it may be impractical and costly to construct an extra division $\mathcal{T}_H$ together with the interpolation $I_H$ and the coarse grid operator $A_0 = A_H$. For these cases, it may be advantageous to use another construction of the auxiliary global space $V_0$ by aggregation. This construction will be described in the next section.
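
The effect of the coarse space can be seen on the same kind of 1D model problem. The following sketch compares the one-level and two-level additive preconditioners; it is an illustration under assumptions that are not taken from the paper: 127 interior nodes, 16 index blocks with one node of overlap per side, and a coarse space spanned by hat functions on the coarse grid with nodes at the block boundaries. The function names are hypothetical.

```python
import numpy as np

def laplacian_1d(n):
    """Tridiagonal 1D Laplacian stencil on n interior nodes."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def local_corrections(A, m, overlap):
    """Sum of B_k = I_k A_k^{-1} R_k over m overlapping index blocks."""
    n = A.shape[0]
    size = (n + 1) // m
    G = np.zeros((n, n))
    for k in range(m):
        idx = np.arange(max(k * size - overlap, 0),
                        min((k + 1) * size + overlap, n))
        block = np.ix_(idx, idx)
        G[block] += np.linalg.inv(A[block])
    return G

def coarse_prolongation(n, m):
    """Columns are coarse hat functions sampled at the fine nodes."""
    x = np.arange(1, n + 1) / (n + 1)          # fine interior nodes
    X = np.arange(1, m) / m                    # coarse interior nodes
    H = 1.0 / m
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - X[None, :]) / H)

def cond_preconditioned(A, G):
    """Condition number of G A via the similar symmetric matrix L^T G L."""
    L = np.linalg.cholesky(A)
    ev = np.linalg.eigvalsh(L.T @ G @ L)
    return ev[-1] / ev[0]

n, m, overlap = 127, 16, 1
A = laplacian_1d(n)
G_one = local_corrections(A, m, overlap)
P0 = coarse_prolongation(n, m)
A0 = P0.T @ A @ P0                              # coarse operator on V_0
G_two = G_one + P0 @ np.linalg.solve(A0, P0.T)  # adds the coarse correction B_0
cond_one = cond_preconditioned(A, G_one)
cond_two = cond_preconditioned(A, G_two)
```

In this setting `cond_two` is much smaller than `cond_one`, which is the behaviour predicted by comparing (23) with (25) when $\delta$ and $H$ shrink together.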



5 Two-level Schwarz preconditioners with aggregations

Let $V = V_h = \mathrm{span}\{\phi_1, \ldots, \phi_n\}$ and let the index set $\{1, \ldots, n\}$ be decomposed into groups $G_1, \ldots, G_N$. Then it is possible to define aggregated basis functions $\psi_k$ and the space $V_0 \subset V$ as follows,

\[ \psi_k = \sum_{i \in G_k} \phi_i, \qquad V_0 = \mathrm{span}\{\psi_1, \ldots, \psi_N\}. \tag{28} \]

We shall assume that the aggregations are regular, i.e. there is a constant $\rho$ such that each $\mathrm{supp}\,\psi_k$ contains a ball with diameter $\rho H$, where

\[ H = \max_k \mathrm{diam}(\mathrm{supp}\,\psi_k) \le H_S. \]

As a consequence, there are positive constants $C_1$, $C_2$ such that $C_1 H^d \le |\mathrm{supp}\,\psi_k| \le C_2 H^d$. The space $V_0$ from (28) together with the local FE spaces $V_1, \ldots, V_m$ from (22) create a new decomposition

\[ V = V_0 + V_1 + \ldots + V_m. \tag{29} \]
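
In matrix terms, the aggregated basis (28) corresponds to a Boolean matrix whose columns sum the fine basis functions of each group, and the coarse operator is the Galerkin product with this matrix. The sketch below illustrates this on a small example; the matrix `A`, the groups and the function names are assumptions chosen only for illustration.

```python
import numpy as np

def aggregation_matrix(n, groups):
    """Column k holds the coefficients of psi_k = sum_{i in G_k} phi_i."""
    P = np.zeros((n, len(groups)))
    for k, group in enumerate(groups):
        P[group, k] = 1.0
    return P

n = 12
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # illustrative SPD matrix
groups = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9), np.arange(9, 12)]
P0 = aggregation_matrix(n, groups)
A0 = P0.T @ A @ P0            # Galerkin coarse matrix on V_0

def coarse_correction(r):
    """Apply B_0 = I_0 A_0^{-1} R_0 in matrix form."""
    return P0 @ np.linalg.solve(A0, P0.T @ r)
```

No coarse mesh is needed: only the grouping of the unknowns enters. For this particular `A` and grouping, `A0` is again a tridiagonal matrix with 2 on the diagonal and -1 next to it, and the residual after `coarse_correction` is orthogonal to all aggregated basis functions.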

Theorem 5. The decomposition (29) fulfils the assumptions (A1), (A2) with the constants

\[ K_0 = C (1 + h^{-1} H + \delta^{-2} H^2), \qquad K_1 = 2 (1 + m_c). \tag{30} \]



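Sketch of the proof. We can follow the same procedure as in the proof of Theorem 4, but instead of the $L_2$-projection to $V_0$ we shall consider an averaging operator $Q$.

- Let us consider the mapping $Q : V_h \to V_0$ defined as follows:

\[ \text{for } v \in V_h, \qquad Qv = \sum_{k=1}^{N} \alpha_k \psi_k, \qquad \alpha_k = \frac{1}{|\mathrm{supp}\,\psi_k|} \int_{\mathrm{supp}\,\psi_k} v \, dx. \tag{31} \]

For this operator, it holds that

\[ |Qv|^2_{H^1(\Omega)} \le \sigma_1^2 \frac{H}{h} |v|^2_{H^1(\Omega)}, \tag{32} \]
\[ \|v - Qv\|_{L^2(\Omega)} \le \sigma_2 H |v|_{H^1(\Omega)}, \tag{33} \]

see [11, 15] for the details.

- For any $v \in V$, we take the decomposition

\[ v = v_0 + v_1 + \ldots + v_m, \qquad v_0 = Qv, \qquad v_k = \Pi_h(\theta_k (v - v_0)), \quad k = 1, \ldots, m. \]

- Then

\[ \|v_0\|_A^2 \le \kappa_2 |v_0|^2_{H^1(\Omega)} \le \kappa_2 \sigma_1^2 \frac{H}{h} |v|^2_{H^1(\Omega)} \le \kappa_2 \sigma_1^2 \kappa_1^{-1} \frac{H}{h} \|v\|_A^2. \]

Further,

\begin{align*}
\sum_{k=1}^{m} \|v_k\|_A^2 &\le \kappa_2 \sum_{k=1}^{m} |v_k|^2_{H^1(\Omega)} = \kappa_2 \sum_{k=1}^{m} \sum_{T \subset \Omega_k^{\delta}, \ T \in \mathcal{T}_h} |v_k|^2_{H^1(T)} \\
&\le \kappa_2\, \mu\, m_c \bigl[ \delta^{-2} \|v - v_0\|^2_{L^2(\Omega)} + |v - v_0|^2_{H^1(\Omega)} \bigr] \\
&\le \kappa_2\, \mu\, m_c \Bigl[ \sigma_2^2 \delta^{-2} H^2 + 2 + 2 \sigma_1^2 \frac{H}{h} \Bigr] |v|^2_{H^1(\Omega)} \\
&\le C \bigl[ 1 + h^{-1} H + \delta^{-2} H^2 \bigr] \|v\|_A^2.
\end{align*}

- The above estimates give (30). The estimate of $K_1$ is the same as in Theorem 4. □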

The space $V_0$ created by aggregation was first introduced in the multigrid context, see e.g. [3] and the references therein. The properties of the basis functions $\{\psi_k\}$ can be improved by smoothing, and the smoothed aggregations can again be used in two-level Schwarz preconditioners, see [11].

Note that the paper [15] describes the use of massive aggregation, which 
means aggregation of all degrees of freedom in each subdomain. According to 
our experience, less massive aggregation is more efficient and better balanced 
with the other subproblems for smaller numbers of subdomains. 

For structured grids, the aggregations can be done by regular clustering. 
Some algorithms for creating aggregations for unstructured grids are described 
e.g. in [11]. 



6 Two-level Schwarz preconditioners with interfaces 

The considered two-level Schwarz preconditioners involve two ways of inter- 
action between the subproblems: via the overlap and via the coarse grid. In 
this section, we would like to show that the interaction via the coarse grid is 
sufficient if the coarse grid involves the interfaces between the subdomains. 

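Thus, let $\Omega_1, \ldots, \Omega_m$ be a non-overlapping decomposition of the domain $\Omega$ and let each $\Omega_k$ be aligned with a fine FE division $\mathcal{T}_h$ of the domain $\Omega$. Let

\[ V_k = \{ v \in V : v = 0 \ \text{in } \Omega \setminus \Omega_k \}, \qquad W = V_1 + \ldots + V_m. \tag{34} \]

Then $W \subset V$, but $W \neq V$. If there is a coarse grid space $V_0$ such that

\[ V = V_0 + V_1 + \ldots + V_m = V_0 + W, \tag{35} \]

then $V_0$ should contain all degrees of freedom corresponding to the interface $\Gamma = \bigcup \Gamma_{kl}$, where $\Gamma_{kl} = (\partial\Omega_k \cap \partial\Omega_l) \setminus \partial\Omega$.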

There are several ways to construct a proper space $V_0$. The first
possibility is to use a coarse grid $\mathcal{T}_H$ providing enough nodes on the interface
$\Gamma$, i.e. $\mathrm{node}(\mathcal{T}_H) \cap \Gamma = \mathrm{node}(\mathcal{T}_h) \cap \Gamma$. The coarse grid $\mathcal{T}_H$ should also be
compatible with the subdomains and $\mathcal{T}_h$ should be a refinement of $\mathcal{T}_H$.
Such a situation can be seen in Fig. 2.

Fig. 2. A coarse grid with interfaces (left) and its refinement with the domain
decomposition (right)

Another, more flexible way to create a proper space $V_0$ is the
construction by aggregation. This construction can be similar to that in the previous
section; the only difference is that we restrict the aggregation to those degrees
of freedom which do not correspond to the nodes on the interfaces between
the subdomains, see Fig. 3.

Fig. 3. Fine grid and domain decomposition (left) and aggregation out of the
interfaces between the subdomains (right)



Theorem 6. The decomposition (35) with subspaces $V_1, \dots, V_m$ corresponding
to non-overlapping subdomains and the coarse grid space $V_0$ with interfaces
fulfils the assumptions (A1), (A2) with the constants

$$K_0 = 1/(1 - \gamma), \qquad K_1 = 2, \tag{36}$$

where $\gamma = \cos(V_0, W_0) < 1$ for any $W_0 \subset W$ such that $V = V_0 \oplus W_0$.

Proof. Any $v \in V$ can be written as $v = v_0 + w_0$, $v_0 \in V_0$, $w_0 \in W_0 \subset W$.
Moreover, $w_0 = v_1 + \dots + v_m$, where $v_k \in V_k$ for $k = 1, \dots, m$. The
subspaces $V_1, \dots, V_m$ corresponding to the non-overlapping subdomains are
$\langle\cdot,\cdot\rangle_A$-orthogonal, thus

$$\sum_{k=1}^{m} \|v_k\|_A^2 = \|v_1 + \dots + v_m\|_A^2 = \|w_0\|_A^2,$$

$$\sum_{k=0}^{m} \|v_k\|_A^2 = \|v_0\|_A^2 + \|w_0\|_A^2 \le \frac{1}{1-\gamma}\,\|v_0 + w_0\|_A^2 = \frac{1}{1-\gamma}\,\|v\|_A^2.$$

On the other hand, for any $v \in V$, $v = v_0 + v_1 + \dots + v_m$, where $v_k \in V_k$
for $k = 0, 1, \dots, m$, we have

$$\|v\|_A^2 \le 2\|v_0\|_A^2 + 2\|v_1 + \dots + v_m\|_A^2 = 2\sum_{k=0}^{m} \|v_k\|_A^2.$$

□
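
The inequality with the factor $1/(1-\gamma)$ in the proof rests on the strengthened C.B.S. inequality. Spelled out as a short derivation (a sketch, assuming the usual definition of $\gamma$ as a constant with $|\langle v_0, w_0\rangle_A| \le \gamma \|v_0\|_A \|w_0\|_A$ for all $v_0 \in V_0$, $w_0 \in W_0$, which is not restated in this section):

```latex
\begin{align*}
\|v_0 + w_0\|_A^2
  &= \|v_0\|_A^2 + 2\langle v_0, w_0\rangle_A + \|w_0\|_A^2 \\
  &\ge \|v_0\|_A^2 - 2\gamma\,\|v_0\|_A\,\|w_0\|_A + \|w_0\|_A^2 \\
  &\ge (1-\gamma)\bigl(\|v_0\|_A^2 + \|w_0\|_A^2\bigr),
\end{align*}
```

where the last step uses $2ab \le a^2 + b^2$ with $a = \|v_0\|_A$, $b = \|w_0\|_A$.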

Note that the constant $\gamma < 1$ appeared also in the analysis of the hybrid
preconditioner $G_H$. We simply get

$$\|I - G_H A\|_A = \Big\| \Big(I - \sum_{k=1}^{m} P_k\Big)(I - P_0) \Big\|_A = \gamma, \tag{37}$$

because $\sum_{k=1}^{m} P_k$ is now again a $\langle\cdot,\cdot\rangle_A$-orthogonal projection. Note also that
$G_H = G_M$ in this case. More details can be found in [5].

As a consequence of Theorem 6, the investigation of convergence of the
two-level Schwarz method with interfaces can be reduced to the investigation of
the constant $\gamma = \cos(V_0, W_0)$. This investigation of the strengthened C.B.S.
inequality can be performed locally on macroelements. It will be illustrated for
a simple 2D problem from Fig. 2 in the following theorem. More results will
be given in a forthcoming paper [5].

Theorem 7. Let us consider the problem (1) with the orthotropic bilinear form (2)
on the domain $\Omega \subset \mathbb{R}^2$ and a situation similar to Fig. 2. It means that there
is a coarse grid $\mathcal{T}_H$ with coarse right-angled triangles inside the subdomains and
special rectangular macroelements and fine triangles along the boundaries of the
subdomains, see Figs. 2 and 4. There is also a refined triangulation $\mathcal{T}_h$, which
consists of congruent right-angled isosceles triangles with two axiparallel sides.
These triangles arise from $\mathcal{T}_H$ by division of each coarse triangle and each
boundary macroelement into four congruent triangles, see Fig. 4.

Let $V_0$ be the FE space corresponding to $\mathcal{T}_H$ and $W$ be the sum of the subspaces
$V_k$ defined in (34). Let $W_0$ contain the functions from $W$ which are zero in
the nodes belonging to $\mathcal{T}_H$. The nonzero coefficients of the bilinear form ($k_{11}$
and $k_{22}$) are assumed to be constant on the macroelements from $\mathcal{T}_H$.
Then

$$\gamma = \cos(V_0, W_0) \le \max\left\{ \frac{\sqrt{2}}{2},\ \sqrt{\frac{k_{11}}{k_{11}+k_{22}}},\ \sqrt{\frac{k_{22}}{k_{11}+k_{22}}} \right\}. \tag{38}$$

Fig. 4. The coarse triangles with refinement (a) and boundary macroelements with
3 triangles and their refinement into 4 triangles (b),(c)



Proof. The C.B.S. constant can be investigated locally on the inner and
boundary macroelements. If

$$a_E(v,w) = \int_E \sum_{i=1}^{2} k_{ii}\, \frac{\partial v}{\partial x_i}\, \frac{\partial w}{\partial x_i}\, dx$$

and

$$a_E(v,w) \le \gamma_E \sqrt{a_E(v,v)}\, \sqrt{a_E(w,w)}$$

for each macroelement $E$ and functions $v$ and $w$, which are restrictions of
$v \in V_0$ and $w \in W_0$ to the macroelement $E$, then

$$\gamma \le \max_E \gamma_E.$$


For the inner macroelements (coarse triangles), it is possible to show that
$\gamma_E \le \sqrt{2}/2$, see e.g. [1] (the estimate is a simple generalization of Remark 3.1).
Now consider an interface macroelement $E$ of the type shown in Fig. 4(b).
For $v \in V_0$, we get



dv 

dxi 



|Ti ^ 



— It 
dxr 



dv 

dx2 



|Ti 



dv 

dx2 



\t2 



For w G Wo, we get = ^7 — 0 in T3 and T4. Further, 



dw dw dw dw 

dx2 dx2 ’ dxi dxi ’ 



dw 



dw 



^ — 1^1 ■q — 1^1 

dxi dx2 



Thus, 
aE{v,v) = 

JE 



= / kn 
E 



dv \ ^ f dv \ ^ 

dxi) ^^\dx 2 j 



dx 



>2 / 

Jti \9xi 



dx^ 






V 



dx, 



aE{v,w) 



/ 



T1UT2 



‘^ll 



dv dw , dv dw \ ^ f . dv dw 

h k22 \ d.T. = 9. I k.. 



dxi dx] 



dx2 dx2 J 



dx 



= 2 / kii^^dx 

J dx\ dx\ 



Ti 
$$\le \sqrt{\frac{k_{11}}{k_{11}+k_{22}}}\; \sqrt{a_E(v,v)}\, \sqrt{a_E(w,w)}, \quad \text{i.e.} \quad \gamma_E = \sqrt{\frac{k_{11}}{k_{11}+k_{22}}}.$$

A similar estimate, $\gamma_E = \sqrt{k_{22}/(k_{11}+k_{22})}$, can be proved for the interface
macroelement $E$ of the type shown in Fig. 4(c). □



Note that for an isotropic bilinear form, we get $\gamma = \sqrt{2}/2$, which indicates very
good efficiency of the two-level Schwarz preconditioners with interfaces. The
efficiency is not deteriorated by jumps in coefficients if these jumps do not
occur within the macroelements. On the other hand, the efficiency deteriorates
if the subdomain boundaries cut the main direction of a strong anisotropy.
Therefore, for strongly anisotropic problems, it may be useful to use domain
decompositions cutting only the weak direction of the anisotropy. The
numerical (grid) anisotropy has of course the same effect as the physical anisotropy.
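
To get a feeling for how the anisotropy enters, the bound can be evaluated numerically. The sketch below is an illustration only: it assumes that (38) reads $\gamma \le \max\{\sqrt{2}/2, \sqrt{k_{11}/(k_{11}+k_{22})}, \sqrt{k_{22}/(k_{11}+k_{22})}\}$ and that the constants of (36) combine into a bound $K_0 K_1$ in the usual way; the function names are hypothetical.

```python
import math


def cbs_bound(k11: float, k22: float) -> float:
    """Upper bound on gamma, assuming the reading of (38) stated above."""
    total = k11 + k22
    return max(math.sqrt(0.5),
               math.sqrt(k11 / total),
               math.sqrt(k22 / total))


def k0_k1_product(gamma: float) -> float:
    """Product K0 * K1 with K0 = 1 / (1 - gamma) and K1 = 2 from (36)."""
    return 2.0 / (1.0 - gamma)


if __name__ == "__main__":
    for k11, k22 in [(1.0, 1.0), (1.0, 0.01), (1.0, 100.0)]:
        g = cbs_bound(k11, k22)
        print(f"k11={k11}, k22={k22}: gamma <= {g:.4f}, K0*K1 <= {k0_k1_product(g):.1f}")
```

Under these assumptions the bound is symmetric in $k_{11}$ and $k_{22}$, since it takes the worst case over both types of interface macroelements; which direction actually hurts depends on how the subdomain boundaries are oriented.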

The idea of using Schwarz preconditioner with no overlap but a suitable 
coarse grid appeared also in the paper [2], but in a context with a different 
construction of the coarse grid problem. 



7 Parallel implementation of the preconditioners 

Let us consider the solution of a symmetric elliptic boundary value problem by
the FE method. Then we have to assemble and solve a linear algebraic system

$$Au = b, \qquad u, b \in \mathbb{R}^n, \tag{39}$$

with a symmetric positive definite $n \times n$ matrix $A$. This system is an algebraic
representation of the problem (5). The system (39) is mostly solved by the
preconditioned conjugate gradient (CG) method.

The solution of large-scale systems requires the use of powerful parallel
computers, and the domain decomposition can then be exploited for the parallel
implementation of all main operations:

— assembling of the system, 

— matrix-by-vector multiplication, 

— computation of vector updates and scalar products, 

— construction of preconditioners. 

This parallel implementation uses a decomposition of data (vectors, matrices)
into blocks corresponding to the subdomains. The overlapping DD induces a
decomposition of data into overlapping blocks $v = \{v_k\}$ etc. The matrix-by-vector
multiplication can be implemented with both overlapping and non-overlapping
blocks. The first case may be advantageous if the extended diagonal blocks
$A_k$, $k = 1, \dots, m$, cover all nonzero elements of $A$.

Blocks of data are mapped to the processors of the parallel computer, and
the domain decomposition, or more precisely the decomposition of the FE grid,
should be done with special care for the computational load of the processors.
This is not difficult for structured grids. For a suitable decomposition of
unstructured grids, it is possible to use several graph algorithms, see e.g. [18]. The
decomposition should also respect anisotropies as mentioned in Section 6 and
further in Section 8. For two-level preconditioners, it is additionally necessary to
balance the size of the auxiliary global problem.

The space decomposition preconditioners require solving subproblems
of the type $A_k w_k = v_k$. Although these subproblems are smaller than the
original problem, it may still be too expensive or inefficient to solve these
systems exactly by direct solvers. The exact solution of the subproblems
can then be avoided by different means.

First, we can use an approximation $\tilde{A}_k$ to $A_k$ for computing approximate
values of $A_k^{-1} v_k$. This approximation can be given by an incomplete factorization
of $A_k$ or by applying a fixed number of iterations of some linear stationary
iterative method such as SSOR or multigrid. The use of approximate operators $\tilde{A}_k$
is covered by the theory described e.g. in [20].

Second, we can use inner CG iterations for solving $A_k w_k = v_k$ approximately,
up to some lower relative inner accuracy $\|A_k w_k - v_k\| \le \varepsilon_0 \|v_k\|$, where
$\varepsilon_0$ is a fairly coarse tolerance. This choice gives SD preconditioners which are no longer linear
operators. We can speak about nonlinear preconditioners. These preconditioners
can be implemented within the standard preconditioned CG method, but
the resulting inner-outer iterative method can fail or converge slowly.
The reason for such difficulties is the loss of orthogonality, and sometimes
also the loss of linear independence, of the search directions in the CG method with
nonlinear preconditioning.

A remedy is to store (some) search directions and orthogonalize the new 
one to the stored ones. Such flexible or generalized preconditioned CG method 
is described e.g. in [4] and the references therein. In the next section, we shall 
use the generalized preconditioned GPCG[s] method with orthogonalization of 
the search direction to s previous ones. 
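
The idea of orthogonalizing each new search direction against the $s$ stored ones can be sketched in a few lines. The code below is a minimal illustration written for this text, not the implementation from [4]: the function name `gpcg`, the dense list-of-lists matrix format and the stopping test are assumptions made for the sake of a self-contained example. The preconditioner is passed as an arbitrary callable, so it may be nonlinear.

```python
from collections import deque


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def axpy(alpha, x, y):
    """Return alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]


def gpcg(A, b, precond, s=1, tol=1e-10, max_iter=200):
    """Generalized preconditioned CG with A-orthogonalization of the new
    search direction against the s most recent ones (hypothetical sketch)."""
    u = [0.0] * len(b)
    r = list(b)                       # residual for the zero initial guess
    b_norm = dot(b, b) ** 0.5 or 1.0
    stored = deque(maxlen=s)          # triples (d_j, A d_j, <d_j, A d_j>)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return u, it
        g = precond(r)                # possibly nonlinear preconditioning step
        d = list(g)
        for d_j, Ad_j, dAd_j in stored:
            d = axpy(-dot(g, Ad_j) / dAd_j, d_j, d)
        Ad = matvec(A, d)
        dAd = dot(d, Ad)
        alpha = dot(r, d) / dAd
        u = axpy(alpha, d, u)
        r = axpy(-alpha, Ad, r)
        stored.append((d, Ad, dAd))
    return u, max_iter
```

For a fixed symmetric positive definite linear preconditioner and $s = 1$ this sketch generates the same iterates as the standard preconditioned CG method in exact arithmetic; larger $s$ only matters when the preconditioner varies from step to step or is nonsymmetric.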

The GPCG[s] method also allows the use of nonsymmetric multiplicative or
nonsymmetric hybrid preconditioners. These nonsymmetric preconditioners
are substantially cheaper than the symmetrized variants, which require at least
one extra update of the residual, and they frequently lead to convergence
rates similar to those of the symmetrized variants.

The convergence of GPCG[s] with nonlinear approximate additive SD
preconditioners can be proved with the aid of the convergence theory, which can
be found e.g. in [4]. The convergence of GPCG[s] with a nonlinear approximate
multiplicative or hybrid preconditioner $G = G_M, G_H$ can be proved if
$\|I - \omega G A\|_A < 1$ for some $\omega \in (0, 1)$, see again [4].
8 Numerical results

The efficiency of various SD preconditioners described in this paper can be
compared by solving a simple model problem

$$k_{11}\frac{\partial^2 u}{\partial x^2} + k_{22}\frac{\partial^2 u}{\partial y^2} = f \ \text{ in } \Omega = (0,2) \times (0,3), \qquad u = 0 \ \text{ on } \partial\Omega,$$

discretized by linear triangular FE. We use the uniform grid with the mesh
size $h = 1/30$, subdomains $\Omega_k = (0,2) \times (x_k, x_{k+1})$ and overlap $\delta = 2h$. The
subproblems are solved exactly. The required numbers of iterations for a prescribed
relative accuracy $\varepsilon$ and various additive (AP) and hybrid (HP) SD preconditioners
can be seen in Table 1. The hybrid preconditioners are used in nonsymmetric
form in combination with GPCG[1]. The coarse grid is either a nested
coarse triangular grid, an aggregation with clustering of 2×2 square macroelements,
or the same aggregation with interface. For each number of subdomains, the
first column shows the numbers of iterations for $k_{11} = k_{22} = 1$, the second one
for $k_{11} = 1$, $k_{22} = 1/100$ and the last one for $k_{11} = 1$, $k_{22} = 100$.



Table 1. The numbers of iterations for a prescribed relative accuracy $\varepsilon$. AP denotes
additive preconditioner, HP denotes nonsymmetric hybrid preconditioner +
GPCG[1].

                         |                    Number of subdomains
Type  Coarse grid space  |      4      |      8      |     12      |     16      |     24
AP    one level          | 16   4   26 | 22   4   38 | 23   5   51 | 31   5   61 | 37   6   78
AP    nested, H = 3h     |  7   7   16 |  8   7   20 |  8   7   23 |  7   7   26 |  8   7   31
HP    nested, H = 3h     |  6   6   13 |  6   7   16 |  6   7   20 |  6   7   22 |  6   6   27
AP    regular agg's 2h   | 13   8   19 | 15   8   23 | 16   9   27 | 17   9   30 | 17   9   36
HP    regular agg's 2h   | 10   7   15 | 11   7   19 | 11   7   22 | 11   8   25 | 11   8   29
AP    agg's 2h + int.    | 14   8   22 | 14   8   30 | 14   8   36 | 14   9   40 | 14   9   46
HP    agg's 2h + int.    |  8   4   14 |  7   5   16 |  8   5   19 |  8   5   22 |  8   5   25




For a demonstration of the efficiency of parallel solvers based on the CG method and
additive Schwarz preconditioners, we give in Table 2 the numbers of iterations
and computing times for the solution of a large-scale linear system of dimension
about 4 million, arising from the FE discretization of a 3D elasticity problem
which has a practical application in geomechanics. More details can be found
e.g. in [6]. The adopted discretization uses linear tetrahedral FE.

The computations use a one-directional decomposition into up to 8 subdomains
and a coarse grid problem created by regular aggregation of 2×2×2 or 5×5×5
hexahedral macroelements. The computations were performed on a small Linux
cluster consisting of 8 PCs with AMD Athlon/1400 processors and 768 MB of
memory each. The computers are interconnected via a standard Fast Ethernet 100
Mbit/s network.
Table 2. Results from the solution of the large-scale 3D elasticity problem with a
prescribed relative accuracy $\varepsilon$. #p denotes the number of exploited processors, m is the
number of subdomains.

     |  No coarse grid   | Aggregation 2x2x2 | Aggregation 5x5x5
  m  |  #p   #it   T[s]  |  #p   #it   T[s]  |  #p   #it   T[s]
  2  |   2   104    428  |   3    45    268  |   3    56    243
  3  |   3   113    321  |   4    47    242  |   4    59    176
  4  |   4   121    268  |   5    51    240  |   5    62    144
  5  |   5   128    234  |   6    53    236  |   6    65    127
  6  |   6   133    211  |   7    55    244  |   7    67    114
  7  |   7   136    193  |   8    57    265  |   8    70    105




9 Concluding remarks 

We used the space decomposition framework for providing an overview of the 
Schwarz-type domain decomposition preconditioners. We described a large va- 
riety of these preconditioners, still not including some newly developed vari- 
ants. 

Our experience shows that the Schwarz preconditioners can be used for the
development of efficient and scalable parallel solvers, at least when working
on smaller parallel computing systems. These methods are flexible enough to allow
balancing the computational load of the processors, adopting inexact solvers
etc. On the other hand, we showed that special care should be devoted to
physical or numerical anisotropy and similar difficulties.

The Schwarz technique can also be used for solving other problems, such as
nonsymmetric problems of convection-diffusion type, parabolic and nonlinear
problems and many others. The general SD framework is also suitable for
developing preconditioners based on hierarchical or physical decompositions,
see e.g. [13, 20, 21, 4].

Acknowledgement: Many thanks are due to P. Byczanski and J. Stary 
for preparing the numerical results. The work was supported by the grant 
S3086102 of the Academy of Sciences of the Czech Republic. 
Summary. Different types of boundary conditions in industrial numerical
simulators involving the discretization of hyperbolic systems are presented. For some of
them, one may determine the problem to which the limit of approximate solutions
(as the discretization parameters tend to 0) is the unique solution. In turn, this
convergence result may suggest other ways to take into account the boundary conditions.



1 Introduction 

In the industrial context, efficient numerical simulators are often developed
after a long “trial and error” procedure. The efficiency of the simulators may
be evaluated, for instance, by the fact that the solution satisfies some natural
constraints and that it is in agreement with experimental data. In some
cases, estimates on the approximate solutions make it possible to obtain the convergence
of some sequences of approximate solutions as the discretization size tends
to 0. However, it is not easy to answer the following question:
“What problem has a unique solution which is the limit of the approximate
solutions?”

This paper will focus on the problem of boundary conditions needed in the 
discretization of nonlinear hyperbolic equations or systems of equations; this 
problem is not yet clearly understood in many cases. Two different cases will 
be presented: a two phase flow in a pipeline and a two phase flow in a porous 
medium. 



2 A two phase flow in a pipeline 

2.1 Description of the system 

A “simple” model for a two phase flow in a pipeline (see [8], for instance) leads
to a 3 × 3 system of conservation laws. The unknown $w$ is a function from
$(0,1) \times \mathbb{R}_+$ to $\mathbb{R}^3$, solution of the following system:

$$w_t + (F(w))_x = 0, \quad x \in (0,1),\ t \in \mathbb{R}_+, \tag{1}$$

where $(\cdot)_t$ and $(\cdot)_x$ denote the derivatives with respect to the $t$ and $x$ variables.
The first two equations of (1) give the mass conservation of the 2 phases (gas
and liquid) and the third one is the momentum equation for the mixture. The
expression of the given function $F : \mathbb{R}^3 \to \mathbb{R}^3$ is quite complicated. It takes
into account thermodynamical laws and a hydrodynamical law. System (1)
is hyperbolic: for any $w \in \mathbb{R}^3$, the Jacobian matrix $DF(w)$ is diagonalizable
in $\mathbb{R}$. The three eigenvalues can be ordered: $\lambda_1(w) < \lambda_2(w) < \lambda_3(w)$. In
real situations, the first eigenvalue, $\lambda_1(w)$, is negative and the third, $\lambda_3(w)$,
is positive (they correspond to some “pressure waves” which are related to
a “sound velocity”). The second eigenvalue, $\lambda_2(w)$, corresponds to some mean
velocity between the two phases and can change sign. One can also note that
the field related to this second eigenvalue is quite complicated because it is
not, in general, a genuinely nonlinear field or a linearly degenerate field. In
petroleum engineering, the wave associated to this second eigenvalue is a “void
fraction wave”; engineers require a good representation of this wave in the
numerical simulations.

Remark 1. In real situations, the function F in System (1) also depends on x, 
in order to take into account, for instance, the variation in the slope of the 
pipeline. Moreover, some source terms have to be added to the system, in order 
to take into account, for instance, some friction terms. 

In order to complete System (1), an initial condition is prescribed:

$$w(x, 0) = w_0(x), \quad x \in (0,1), \tag{2}$$

and it is also necessary to give some boundary conditions. This turns out to
be not so easy. Indeed, classically, a general principle is that the number of
boundary conditions needs to be equal to the number of positive eigenvalues of
the Jacobian matrix at $x = 0$ and to the number of negative eigenvalues of the
Jacobian matrix at $x = 1$ (and these boundary conditions have to satisfy some
compatibility conditions). However, this principle is not so easy to apply
when an eigenvalue changes sign during the simulation (or in the case of a null
eigenvalue). A very interesting case is the so called “severe slugging” case in
a pipeline. For this case, there are always two positive eigenvalues at $x = 0$
and two natural boundary conditions are prescribed at $x = 0$, namely the
fluxes of gas and liquid; these boundary conditions can be taken constant in
time. At $x = 1$, there is one natural boundary condition, namely the pressure
(which is the same for the two phases, in this model), to be prescribed. It can
also be constant in time. The true physical solution, which is measured by
experiments (and the aim is to model these experiments), is periodic in
time and it appears that, at $x = 1$, the first eigenvalue is always negative and
the third one is always positive but the second eigenvalue changes sign during
the simulation. In the sequel, one presents different ways to take into account
the boundary conditions and one gives a convergence result in a simplified
case.
2.2 Discretization of the problem

In order to discretize Problem (1), (2) and some boundary conditions, which
will be introduced later, let $h = 1/N$ (with $N \in \mathbb{N}^*$) be the mesh size and $k > 0$
be the time step (assumed to be constant, for the sake of simplicity). The
discrete unknowns are the values $w_i^n \in \mathbb{R}^3$ for $i \in \{1, \dots, N\}$ and $n \in \mathbb{N}$. The
discretization of the initial condition leads to

$$w_i^0 = \frac{1}{h} \int_{(i-1)h}^{ih} w_0(x)\,dx, \quad i \in \{1, \dots, N\}. \tag{3}$$

For the computation of $w_i^n$ for $n > 0$, one uses an explicit, 3-point scheme:

$$\frac{h}{k}\,(w_i^{n+1} - w_i^n) + F_{i+1/2}^n - F_{i-1/2}^n = 0, \quad i \in \{1, \dots, N\},\ n \in \mathbb{N}. \tag{4}$$

For $i \in \{1, \dots, N-1\}$, one takes $F_{i+1/2}^n = g(w_i^n, w_{i+1}^n)$, where $g$ is the numerical
flux. It has to satisfy, in particular, the classical consistency condition, namely
$g(a,a) = F(a)$, and needs to be chosen in order to obtain some stability
properties for the numerical scheme under a so called CFL condition on the
time step (see Sect. 3 for the study of a scalar model). In the case of two
phase flow in a pipeline, the classical numerical fluxes such as the Godunov
flux (see [9]) or the Roe flux (see [11]) may not be implemented, because of
computational difficulties. A convenient choice is obtained with a simplified
Roe flux, namely $g(a,b) = \frac{1}{2}(F(a) + F(b)) + \frac{1}{2}|A(a,b)|(a - b)$, where $A(a,b)$ is some
approximation of the Jacobian matrix, depending on $a$ and $b$, but not satisfying
the so called Roe condition, see [8].
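
A scalar analogue makes the structure of this flux easy to see. The sketch below is only an illustration: it treats a scalar flux `f` with derivative `df`, and it assumes that $A(a,b)$ is the derivative evaluated at the midpoint of $a$ and $b$, which is one possible approximation of the Jacobian and not the choice made in [8]. For a general flux this midpoint choice does not satisfy the Roe condition, although for a quadratic flux it happens to.

```python
def simplified_roe_flux(f, df, a, b):
    """Scalar analogue of g(a, b) = (F(a) + F(b)) / 2 + |A(a, b)| (a - b) / 2.

    Assumption: A(a, b) = df((a + b) / 2), a hypothetical choice made for
    this illustration only.
    """
    jacobian = df(0.5 * (a + b))
    return 0.5 * (f(a) + f(b)) + 0.5 * abs(jacobian) * (a - b)


if __name__ == "__main__":
    f = lambda s: s * s
    df = lambda s: 2.0 * s
    # Consistency: g(a, a) = f(a).
    print(simplified_roe_flux(f, df, 1.5, 1.5))
    # With both states nonnegative the flux reduces to the upstream value f(a).
    print(simplified_roe_flux(f, df, 1.0, 0.0))
```

The viscosity term $\frac{1}{2}|A(a,b)|(a-b)$ vanishes for $a = b$, which is exactly the consistency condition $g(a,a) = F(a)$ stated above.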



Remark 2. In fact, for the simulation of a two phase flow in a pipeline, the
magnitude of the so-called fast eigenvalues, $\lambda_1$ and $\lambda_3$, is much greater than
that of $\lambda_2$; the choice in [8] is to use an implicit scheme with respect to the fast
eigenvalues, whereas the eigenvalue $\lambda_2$, which corresponds to the void fraction
wave, is handled with an explicit second order discretization, since the void
fraction wave needs to be simulated precisely (see [8] for details).

Let us now define the fluxes $F_{1/2}^n$ and $F_{N+1/2}^n$ at the boundary.



2.3 Boundary conditions for the discretized problem 

In order to compute $F_{1/2}^n$ (and similarly $F_{N+1/2}^n$), a good way is to know, or
to determine, some artificial value $w_0^n \in \mathbb{R}^3$ (and $w_{N+1}^n \in \mathbb{R}^3$) and to take
$F_{1/2}^n = g_0(w_0^n, w_1^n)$ (and $F_{N+1/2}^n = g_1(w_N^n, w_{N+1}^n)$). The numerical fluxes $g_0$ and
$g_1$ can be chosen equal to $g$, but this is not at all necessary (see the convergence
result of Sect. 3); in fact, there are numerous situations where one should take
$g_0$ and $g_1$ different from $g$. Indeed, the scheme is often very sensitive to the
computation of the boundary fluxes and it is often worthwhile to use a more
precise, but also more expensive numerical flux (such as the Godunov flux, for
instance) for the computation of the boundary fluxes than for the computation
of the interior fluxes. The difficulty is now to determine these artificial values,
$w_0^n$ and $w_{N+1}^n$.

Remark 3. In some cases, the choice of $w_0^n$ and $w_{N+1}^n$ is quite easy. A well
known example is given by the wall-boundary condition for the Euler equations
(with a perfect gas state law or a more general state law). For the sake
of simplicity, let us mention the one-dimensional case; the generalization to
a multi-dimensional case is quite easy. The Euler equations may be written in
the form (1), corresponding to conservation of mass, momentum and energy,
with $w = (\rho, \rho u, E)^t$, where $\rho$ is the density of the fluid, $u$ its velocity, and
$E$ its energy. The wall-boundary condition at $x = 0$ is $u = 0$, and the only
component to compute for the boundary condition is the second component of
$F_{1/2}^n$, which is equal here to the pressure at $x = 0$ (since $u = 0$ at the wall), say
$p_{1/2}^n$. The value $p_{1/2}^n$ may be computed from the values $\rho_1^n$, $u_1^n$ and $p_1^n$. A natural
choice for $w_0^n$ is to take $\rho_0^n = \rho_1^n$, $u_0^n = -u_1^n$ and $p_0^n = p_1^n$. The flux $F_{1/2}^n$ (that is
the value $p_{1/2}^n$) is then obtained with $F_{1/2}^n = g_0(w_0^n, w_1^n)$ and a convenient choice
of the numerical flux $g_0$. We suggest choosing $g_0$ as the Godunov flux (or as
a linearized Godunov flux, see [3] for instance). Numerical tests which were
performed in [3] show that this choice is very satisfactory, even in the difficult
case of a strong depressurization at the boundary. These tests also show that
the pressure obtained with the Roe flux is not so satisfactory and neither is
the choice $p_{1/2}^n = p_1^n$, which may seem natural (in particular, in 2D simulations,
using a dual mesh obtained with a finite element primal mesh).
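
The mirror state of the remark above is simple enough to write down directly. The following sketch is an illustration under stated assumptions: it uses a perfect gas law with total energy $E = p/(\gamma_g - 1) + \frac{1}{2}\rho u^2$ and an adiabatic exponent of 1.4, neither of which is fixed by the text, and the function names are hypothetical. It only builds the artificial state $w_0^n$; the boundary flux itself would still be obtained from a numerical flux $g_0$ such as the Godunov flux.

```python
def conservative(rho, u, p, gamma_gas=1.4):
    """Map (density, velocity, pressure) to w = (rho, rho * u, E).

    Assumption: perfect gas, E = p / (gamma_gas - 1) + rho * u**2 / 2.
    """
    energy = p / (gamma_gas - 1.0) + 0.5 * rho * u * u
    return (rho, rho * u, energy)


def wall_ghost_state(rho1, u1, p1):
    """Artificial state next to a wall at x = 0: same density and pressure,
    opposite velocity."""
    return (rho1, -u1, p1)


if __name__ == "__main__":
    inner = (1.2, 30.0, 1.0e5)              # hypothetical values in the first cell
    ghost = wall_ghost_state(*inner)
    print(conservative(*inner))
    print(conservative(*ghost))
```

In conservative variables the mirror state differs from the first cell only by the sign of the momentum, so the mean velocity of the pair vanishes, in agreement with the wall condition $u = 0$.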

In most cases, however, the choice of $w_0^n$ and $w_{N+1}^n$ is not so easy. A possible
method, which is described in [4], is now laid out, for a fixed $n$ and $g_0$ given:

1. Compute $DF(w_1^n)$, its eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ and a basis of $\mathbb{R}^3$,
$\{\varphi_1, \varphi_2, \varphi_3\}$, such that $DF(w_1^n)\varphi_i = \lambda_i \varphi_i$, $i = 1, 2, 3$.

2. Write $w_1^n$ on the basis $\{\varphi_1, \varphi_2, \varphi_3\}$, namely $w_1^n = \alpha_1\varphi_1 + \alpha_2\varphi_2 + \alpha_3\varphi_3$.

3. Let $p$ be the number of positive eigenvalues, compute $w_0^n = \beta_1\varphi_1 + \beta_2\varphi_2 +
\beta_3\varphi_3$ and $F_{1/2}^n = g_0(w_0^n, w_1^n)$, where the three unknowns $\beta_1$, $\beta_2$ and $\beta_3$ are
determined by the $p$ equations stating the boundary conditions (note that
these equations involve the components of $F_{1/2}^n$) and by the $3 - p$ equalities
$\beta_i = \alpha_i$ for $\lambda_i < 0$.

This method leads, at each time step, to a nonlinear system of 3 equations
with 3 unknowns (except if $\lambda_i = 0$ for some $i$), namely $\beta_1$, $\beta_2$ and $\beta_3$;
note that some compatibility conditions are needed in order that this nonlinear
system has a solution. Several variants of this method are possible. For
instance, a boundary condition may be imposed on $w_0^n$ rather than $F_{1/2}^n$. A
similar method is, of course, possible at point $x = 1$ (changing the role of positive
and negative eigenvalues).
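
The three steps can be illustrated on a much simpler problem than the pipeline model. The sketch below is a toy example under explicit assumptions, not the method of [4] in its generality: it treats a linear 2 × 2 system $w_t + A w_x = 0$ with $A = \begin{pmatrix} a & b \\ b & a \end{pmatrix}$, whose eigenpairs are known in closed form, it uses the variant in which the boundary condition is imposed on $w_0$ rather than on the flux, and it assumes a hypothetical boundary condition prescribing the first component of $w_0$. It only handles the case of exactly one positive eigenvalue.

```python
def ghost_state(w1, a, b, inflow_value):
    """Artificial state w0 at x = 0 for w_t + A w_x = 0, A = [[a, b], [b, a]].

    Eigenpairs of A: lam = a - b with phi = (1, -1), lam = a + b with phi = (1, 1).
    Assumed boundary condition (hypothetical): first component of w0 equals
    inflow_value.
    """
    lams = (a - b, a + b)
    phis = ((1.0, -1.0), (1.0, 1.0))
    # Step 2: coordinates of w1 in the eigenvector basis.
    alphas = ((w1[0] - w1[1]) / 2.0, (w1[0] + w1[1]) / 2.0)
    positive = [i for i, lam in enumerate(lams) if lam > 0]
    if len(positive) != 1 or 0.0 in lams:
        raise ValueError("sketch handles exactly one positive eigenvalue")
    i_pos = positive[0]
    i_neg = 1 - i_pos
    # Step 3: keep the coordinate of the negative eigenvalue and solve the
    # boundary condition for the coordinate of the positive one.
    beta_neg = alphas[i_neg]
    beta_pos = (inflow_value - beta_neg * phis[i_neg][0]) / phis[i_pos][0]
    betas = [0.0, 0.0]
    betas[i_pos], betas[i_neg] = beta_pos, beta_neg
    return tuple(betas[0] * phis[0][c] + betas[1] * phis[1][c] for c in range(2))


if __name__ == "__main__":
    # Wave-type system: eigenvalues -1 and +1.
    print(ghost_state((3.0, 1.0), 0.0, 1.0, 5.0))
```

In this linear toy case the system for the $\beta_i$ is linear and can be solved by hand; in the nonlinear setting described above the same structure leads to a small nonlinear system at every time step.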
This method is not always satisfactory. In the case of severe slugging for
the simulation of two phase flow in a pipeline, the method seems to perform
well at $x = 0$, where the eigenvalues $\lambda_2$ and $\lambda_3$ are always positive and the two
boundary conditions (gas and liquid fluxes) are convenient. However, at $x = 1$,
the second eigenvalue sometimes becomes negative and one needs a second
boundary condition (the first one is a condition on the pressure). A natural
condition seems to be $Q_l = 0$, where $Q_l$ is the second component of the flux $F$,
that is the liquid flux, but this condition does not lead to good results. Other
possible choices of this additional boundary condition at $x = 1$ were tested and
did not give good results. A possible interpretation of this problem is the fact
that the sign of $\lambda_2$ is computed with $w_N^n$. Roughly speaking, it is “too late”
when $\lambda_2(w_N^n)$ becomes negative (see Sect. 3 for the study of a simple scalar
case). Indeed, good results (in agreement with experiments) are obtained with
the unilateral condition $Q_l \ge 0$ (whatever the sign of $\lambda_2(w_N^n)$). It consists in
using the preceding method (for the boundary condition at $x = 1$) and in
replacing, in the numerical scheme (4), the second component of $F_{N+1/2}^n$ by its
positive part. Then, if $\lambda_2(w_N^n) < 0$, two boundary conditions are given at $x = 1$
(pressure and $Q_l = 0$) and if $\lambda_2(w_N^n) > 0$, one boundary condition is given at
$x = 1$ (pressure) but, in (4), the second component of $F_{N+1/2}^n$ is replaced by its
positive part.

In the following section, we will try to understand the sense of this
boundary condition in a simplified scalar case.



3 The scalar case 

A general convergence result is presented here in the case of a scalar equation. 
Then this result will be applied to understand the sense of the boundary 
condition, described at x = 1 in the previous section, in a simplified scalar 
case. 



3.1 A general convergence result 

The unknown is now a function $u : (0,1) \times \mathbb{R}_+ \to \mathbb{R}$. The flux is a function
$f \in C^1(\mathbb{R}, \mathbb{R})$ (or $f : \mathbb{R} \to \mathbb{R}$ Lipschitz continuous) and the initial datum is
$u_0 \in L^\infty((0,1))$. Let $A, B \in \mathbb{R}$ be such that $A \le u_0 \le B$ a.e. The problem to
solve is:

$$u_t + (f(u))_x = 0, \quad x \in (0,1),\ t \in \mathbb{R}_+, \tag{5}$$

with the initial condition:

$$u(x, 0) = u_0(x), \quad x \in (0,1), \tag{6}$$

and some boundary conditions which will be prescribed later.




44 



T. Gallouet 



As in the previous section, let $h = 1/N$ (with $N \in \mathbb{N}^*$) be the mesh size and
$k > 0$ be the time step (assumed to be constant, for the sake of simplicity).
The discrete unknowns are now the values $u_i^n \in \mathbb{R}$ for $i \in \{1, \dots, N\}$ and
$n \in \mathbb{N}$. In order to define the approximate solution a.e. in $(0,1) \times \mathbb{R}_+$, one sets
$u_{h,k}(x,t) = u_i^n$ for $x \in ((i-1)h, ih)$, $t \in (nk, (n+1)k)$, $i \in \{1, \dots, N\}$, $n \in \mathbb{N}$.

The discretization of the initial condition leads to

$$u_i^0 = \frac{1}{h} \int_{(i-1)h}^{ih} u_0(x)\,dx, \quad i \in \{1, \dots, N\}. \tag{7}$$

For the computation of $u_i^n$ for $n > 0$, one uses, as before, an explicit,
3-point scheme:

$$\frac{h}{k}\,(u_i^{n+1} - u_i^n) + f_{i+1/2}^n - f_{i-1/2}^n = 0, \quad i \in \{1, \dots, N\},\ n \in \mathbb{N}. \tag{8}$$

For $i \in \{1, \dots, N-1\}$, one takes

$$f_{i+1/2}^n = g(u_i^n, u_{i+1}^n), \tag{9}$$

where $g$ is the numerical flux. Sufficient conditions on $g : [A,B]^2 \to \mathbb{R}$, in
order to have a convergent scheme if $x \in \mathbb{R}$ instead of $(0,1)$, are:

C1: $g$ is nondecreasing with respect to its first argument and nonincreasing
with respect to its second argument,

C2: $g(s,s) = f(s)$, for all $s \in [A,B]$,

C3: $g$ is Lipschitz continuous.

Let $L$ be a Lipschitz constant for $g$ (on $[A,B]^2$) and $\zeta > 0$. If $(0,1)$ is
replaced by $\mathbb{R}$, it is well known (see e.g. [4]) that, if $k \le (1 - \zeta)\frac{h}{L}$, the
approximate solution $u_{h,k}$, that is the solution defined by (7)-(9) (with $i \in \mathbb{Z}$), takes
its values in $[A,B]$ and converges towards the unique entropy weak solution of
(5)-(6) in $L^p_{\mathrm{loc}}(\mathbb{R} \times \mathbb{R}_+)$ ($1 \le p < \infty$) as $h \to 0$.

In the case $x \in (0,1)$ instead of $x \in \mathbb{R}$, one assumes the same conditions
on $g$, namely (C1)-(C3). In order to complete the scheme, one has to define
$f_{1/2}^n$ and $f_{N+1/2}^n$.
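Let $\bar u, \bar{\bar u} \in L^\infty(\mathbb{R}_+)$ be such that $A \le \bar u, \bar{\bar u} \le B$ a.e. on $\mathbb{R}_+$, let
$g_0, g_1 : [A,B]^2 \to \mathbb{R}$ satisfying (C1)-(C3), and define:

$$f_{1/2}^n = g_0(\bar u^n, u_1^n), \qquad f_{N+1/2}^n = g_1(u_N^n, \bar{\bar u}^n), \tag{10}$$

where $\bar u^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar u(t)\,dt$ and $\bar{\bar u}^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar{\bar u}(t)\,dt$.

Then a convergence theorem can be proven as in the case $x \in \mathbb{R}$, see [13]: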



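Theorem 1. Let $f \in C^1(\mathbb{R}, \mathbb{R})$ (or $f : \mathbb{R} \to \mathbb{R}$ Lipschitz continuous). Let
$u_0 \in L^\infty((0,1))$, $\bar u, \bar{\bar u} \in L^\infty(\mathbb{R}_+)$ and $A, B \in \mathbb{R}$ be such that $A \le u_0 \le B$
a.e. on $(0,1)$, $A \le \bar u, \bar{\bar u} \le B$ a.e. on $\mathbb{R}_+$. Let $g, g_0, g_1 : [A,B]^2 \to \mathbb{R}$,
satisfying (C1)-(C3). Let $L$ be a common Lipschitz constant for $g$, $g_0$ and $g_1$
(on $[A,B]^2$) and let $\zeta > 0$. Then, if $k \le (1 - \zeta)\frac{h}{L}$, the equations (7)-(10) define
an approximate solution $u_{h,k}$ which takes its values in $[A,B]$ and converges
towards the unique solution of (11) in $L^p_{\mathrm{loc}}([0,1] \times \mathbb{R}_+)$ for any $1 \le p < \infty$, as
$h \to 0$:

$$\begin{aligned}
& u \in L^\infty((0,1) \times (0,\infty)), \\
& \int_0^\infty \!\! \int_0^1 \left[ (u - \kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u - \kappa)\,(f(u) - f(\kappa))\,\varphi_x \right] dx\,dt \\
& \quad + M \int_0^\infty (\bar u(t) - \kappa)^\pm \varphi(0,t)\,dt + M \int_0^\infty (\bar{\bar u}(t) - \kappa)^\pm \varphi(1,t)\,dt \\
& \quad + \int_0^1 (u_0 - \kappa)^\pm \varphi(x,0)\,dx \ \ge\ 0, \\
& \forall \kappa \in [A,B],\ \forall \varphi \in C_c^1([0,1] \times [0,\infty), \mathbb{R}_+).
\end{aligned} \tag{11}$$

In (11), $M$ is any bound for $|f'|$ on $[A,B]$ (and the solution of (11) does
not depend on the choice of $M$). The definition of $\mathrm{sign}_\pm$ is: $\mathrm{sign}_+(s) = 1$ if
$s > 0$, $\mathrm{sign}_+(s) = 0$ if $s \le 0$, $\mathrm{sign}_-(s) = 0$ if $s \ge 0$, $\mathrm{sign}_-(s) = -1$ if $s < 0$.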



Remark 4.

1. It is interesting to remark that this convergence result is also true if the
function $g$ depends on $i$ and $n$, provided that $L$ is a common Lipschitz
constant for all these functions.

2. The definition (11) of the solution of (5)-(6) with the “weak” boundary
conditions $\bar u$ and $\bar{\bar u}$ at $x = 0$ and $x = 1$ is essentially due to F. Otto, see
[10].

3. It is also interesting to remark that if one replaces, in (11), the two
entropies $(u - \kappa)^\pm$ by the sole entropy $|u - \kappa|$, one has an existence result
(since $|u - \kappa| = (u - \kappa)^+ + (u - \kappa)^-$) but no uniqueness result, see [13] for
a counter-example to uniqueness.

4. This convergence result can be generalized to the multidimensional case,
see Sect. 5 and [13].

If $u$, solution of (11), is regular enough (say $u \in C^1([0,1] \times \mathbb{R}_+)$, for
instance), $u$ satisfies $u(0,t) = \bar u(t)$ and $u(1,t) = \bar{\bar u}(t)$ in the weak sense given in
[1]. This condition is very simple if $f$ is monotone:

If $f' > 0$, then $u(0, \cdot) = \bar u$ and $u$ does not depend on $\bar{\bar u}$.

If $f' < 0$, then $u(1, \cdot) = \bar{\bar u}$ and $u$ does not depend on $\bar u$.
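
The explicit 3-point scheme with boundary fluxes of the form $g_0(\bar u^n, u_1^n)$ and $g_1(u_N^n, \bar{\bar u}^n)$ can be sketched in a few lines. The code below is an illustration under explicit assumptions, not a reference implementation: it takes the flux $f(s) = s^2$, uses the Godunov flux for $g = g_0 = g_1$ (which satisfies (C1)-(C3) on bounded intervals), constant data $u_0 = 1$, $\bar u = 1$, $\bar{\bar u} = -2$, and a ratio $k/h = 1/8$ chosen here so that the scheme is monotone for values in $[-2, 1]$. All function names are hypothetical.

```python
def godunov_flux(a, b):
    """Godunov flux for f(s) = s**2: minimum of f on [a, b] if a <= b,
    maximum of f on [b, a] otherwise."""
    if a <= b:
        if a <= 0.0 <= b:
            return 0.0
        return min(a * a, b * b)
    return max(a * a, b * b)


def step(u, u_left, u_right, ratio, g=godunov_flux):
    """One explicit step; ratio = k / h, u_left and u_right are the boundary data."""
    n = len(u)
    flux = [0.0] * (n + 1)
    flux[0] = g(u_left, u[0])            # boundary flux at x = 0
    for i in range(1, n):
        flux[i] = g(u[i - 1], u[i])      # interior fluxes
    flux[n] = g(u[n - 1], u_right)       # boundary flux at x = 1
    return [u[i] - ratio * (flux[i + 1] - flux[i]) for i in range(n)]


def solve(n_cells=50, n_steps=80, u_left=1.0, u_right=-2.0, ratio=0.125):
    u = [1.0] * n_cells                  # cell averages of u0 = 1
    for _ in range(n_steps):
        u = step(u, u_left, u_right, ratio)
    return u


if __name__ == "__main__":
    u = solve()
    print(min(u), max(u))
```

With these data the boundary value at $x = 1$ enters the computation through the boundary flux even though $f'(u_N^n) > 0$ initially, so the discrete solution does not remain constant: the cells near $x = 1$ move towards $-2$ while the values stay in $[-2, 1]$.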



3.2 A very simple example 

One considers here Equation (5), with initial condition (6) and weak boundary
conditions $\bar u$ and $\bar{\bar u}$ at $x = 0$ and $x = 1$, that is in the sense of (11), in the
particular case $f' > 0$. In this case, the main example of numerical flux is
$g = g_0 = g_1$, $g(a,b) = f(a)$, which leads to the well known upstream scheme.
With this choice of $g_0$ and $g_1$, using the notations of Sect. 3.1, the boundary
conditions are taken into account in the form:



$$f_{1/2}^n = f(\bar u^n), \qquad f_{N+1/2}^n = f(u_N^n), \tag{12}$$

with $\bar u^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar u(t)\,dt$. One may apply the general convergence theorem.

The approximate solutions converge (as $h \to 0$) towards the solution of (11).
In this case, the approximate solutions, as well as the solution of (11), do not
depend on $\bar{\bar u}$.

In the case $f' < 0$ the main example is $g = g_0 = g_1$, $g(a,b) = f(b)$, which
also leads to the upstream scheme. The boundary conditions are taken into
account in the following way:

$$f_{1/2}^n = f(u_1^n), \qquad f_{N+1/2}^n = f(\bar{\bar u}^n), \tag{13}$$

with $\bar{\bar u}^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar{\bar u}(t)\,dt$.

These simple cases suggest the following scheme for any $f$, which is the
scalar version of the scheme described in Sect. 2.2 (note that $f'(u)$ is the
Jacobian matrix at point $u \in \mathbb{R}$):

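— Boundary condition at $x = 0$:

$$
\begin{cases}
f_{1/2}^n = f(\bar u^n), & \text{if } f'(u_1^n) \ge 0, \\
f_{1/2}^n = f(u_1^n), & \text{if } f'(u_1^n) < 0.
\end{cases}
\tag{14}
$$

— Boundary condition at $x = 1$:

$$
\begin{cases}
f_{N+1/2}^n = f(\bar{\bar u}^n), & \text{if } f'(u_N^n) < 0, \\
f_{N+1/2}^n = f(u_N^n), & \text{if } f'(u_N^n) \ge 0.
\end{cases}
\tag{15}
$$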



This solution is not always satisfactory, as can be shown on the following
simple example with the Burgers equation:

Let $f(s) = s^2$, $u_0 = 1$ a.e. on $(0,1)$, $\bar u = 1$ a.e. on $\mathbb{R}_+$ and $\bar{\bar u} = -2$ a.e. on
$\mathbb{R}_+$. The exact solution which has to be approached by the numerical scheme
is the unique solution of (11) with these values of $f$, $u_0$, $\bar u$ and $\bar{\bar u}$. Computing
the approximate solution with (7)-(9), the function $g$ satisfying (C2), and with
(14)-(15), leads to an approximate solution which is constant and equal to 1
for any $h$ and $k$. It therefore does not converge (as $h$ and $k$ go to 0) towards
the exact solution, which is not constant and equal to 1 since, for the exact
solution, a shock wave with a negative speed starts from the point $x = 1$ at
time $t = 0$. Indeed, one can also remark that this approximate solution is
the exact solution of (11) with the same values of $f$, $u_0$, $\bar u$ and with any $\bar{\bar u}$
satisfying $\bar{\bar u} \ge -1$ a.e. on $\mathbb{R}_+$. In order to obtain a convergent approximation
of the exact solution corresponding to $\bar{\bar u} = -2$, a good choice is, instead of
(15), $f_{N+1/2}^n = g_1(u_N^n, -2)$ with $g_1$ satisfying (C1)-(C3).
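
The behaviour described in this example can be reproduced numerically. The following Python sketch is only an illustration under stated assumptions, not the implementation used in the paper: it assumes that (7)-(9) is the usual explicit update $u_i^{n+1} = u_i^n - (k/h)(f_{i+1/2}^n - f_{i-1/2}^n)$, it takes the Godunov flux as one admissible choice both for the interior flux $g$ and for $g_1$, and it fixes $k/h = 0.1$. All function names are hypothetical.

```python
def f(s):
    """Burgers flux used in the example, f(s) = s**2."""
    return s * s


def godunov_flux(a, b):
    """Godunov flux for the convex flux f(s) = s**2."""
    if a <= b:
        # minimum of f on [a, b]
        return 0.0 if a <= 0.0 <= b else min(f(a), f(b))
    # maximum of f on [b, a]
    return max(f(a), f(b))


def right_flux_upstream(u_last, u_bb):
    """Rule (15): the boundary flux is chosen from the sign of f'(u_N) = 2 u_N."""
    return f(u_bb) if u_last < 0.0 else f(u_last)


def right_flux_godunov(u_last, u_bb):
    """Alternative rule: f_{N+1/2} = g_1(u_N, u_bb), with g_1 the Godunov flux."""
    return godunov_flux(u_last, u_bb)


def simulate(n_cells, ratio, n_steps, right_flux, u_b=1.0, u_bb=-2.0):
    """Explicit finite volume scheme on (0, 1) with u0 = 1; ratio stands for k / h."""
    u = [1.0] * n_cells
    for _ in range(n_steps):
        # Rule (14) at x = 0.
        left = f(u_b) if u[0] >= 0.0 else f(u[0])
        interior = [godunov_flux(u[i], u[i + 1]) for i in range(n_cells - 1)]
        fluxes = [left] + interior + [right_flux(u[-1], u_bb)]
        u = [u[i] - ratio * (fluxes[i + 1] - fluxes[i]) for i in range(n_cells)]
    return u


# Final time t = n_steps * ratio / n_cells = 0.5.
n_cells, ratio, n_steps = 100, 0.1, 500
u_upstream = simulate(n_cells, ratio, n_steps, right_flux_upstream)
u_godunov = simulate(n_cells, ratio, n_steps, right_flux_godunov)
mean_godunov = sum(u_godunov) / n_cells

print(min(u_upstream), max(u_upstream))
print(mean_godunov, u_godunov[-1])
```

With the upstream rule every flux equals $f(1)$, so the discrete solution never leaves the state 1. With the Godunov boundary flux the right flux is $f(-2) = 4$ while the left flux remains $f(1) = 1$, so the mean value of the discrete solution decreases like $1 - 3t$ as long as the shock has not reached $x = 0$, which is consistent with a shock entering from $x = 1$.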
3.3 A simplified model for two phase flows in pipelines

It is now possible to understand the treatment of the boundary described
in Sect. 2 on a simplified model. This simplified model for two phase flows in
pipelines is given in [12]. For this model, the densities are constant so that there
are no longer pressure waves but only the void fraction wave, corresponding
to the second eigenvalue of the original system (1). It is also easy to see that
for this model, the total flux (that is the sum of the fluxes of the two phases)
is constant in space. One also assumes that this total flux is constant in time
(and positive). System (1) is then reduced to a scalar equation, Equation (5),
where the unknown, $u : (0,1) \times \mathbb{R}_+ \to \mathbb{R}$, is the gas fraction which takes its
values between 0 and 1.

The function $f$ can be taken as $f(s) = as - bs^2$, where $a, b \in \mathbb{R}$ are given
and such that $0 < b < a < 2b$. In (5), the quantity $f(u)$ is the flux of gas.
Then $f(1) - f(u)$ is the flux of liquid. The function $f$ is increasing between
0 and $u_M = a/(2b)$ and decreasing between $u_M$ and 1. An important value is
$u_m \in [0,u_M]$ such that $f(u_m) = f(1)$.

One takes $u_0 = 0$ a.e. on $[0,1]$ as an initial condition. At $x = 0$, the gas
flux is given (as in the complete model, see Sect. 2): one takes $f(u(0,\cdot)) = \bar f$
with $\bar f(t) = c$ for $t < T$ and $\bar f(t) = 0$ for $t > T$, where $c$ and $T$ are given
with $c > f(1)$ and $T$ large enough so that $f'$ changes sign at $x = 1$ during the
simulation. Indeed, in this simplified model, it is also necessary to take $T$ not
too large in order to avoid a problem at $x = 0$ (for $T$ too large, $f'$ will also
change sign at $x = 0$). The boundary condition at $x = 1$ will be described on
the discrete problem below.

The discretization of the problem is performed as before with (7)-(9), with
$g$ satisfying (C1)-(C3).

For the discretization of the boundary condition at $x = 0$, the method
described in Sect. 2 leads here to

$$f_{1/2}^n = \bar f(nk), \tag{16}$$

which is indeed in accordance with the fact that $f'(u_1^n) > 0$ for all $n$, at least
if $T$ is not too large.

For the discretization of the boundary condition at $x = 1$, the first method
described in Sect. 2 and given in (15), using the sign of $f'(u_N^n)$, leads to

$$
\begin{cases}
f_{N+1/2}^n = f(u_N^n), & \text{if } u_N^n \le u_M, \\
f_{N+1/2}^n = f(1), & \text{if } u_N^n > u_M,
\end{cases}
\tag{17}
$$

and does not lead to the desired results. Note also that $f_{N+1/2}^n$, given by (17),
is a discontinuous function of $u_N^n$.

The second method, described in Sect. 2, uses the fact that the liquid flux
cannot be negative at $x = 1$. Since the liquid flux at $x = 1$ is $f(1) - f_{N+1/2}^n$
and since $f(u_m) = f(1)$, this method leads to

$$
\begin{cases}
f_{N+1/2}^n = f(u_N^n), & \text{if } u_N^n \le u_m, \\
f_{N+1/2}^n = f(u_m), & \text{if } u_N^n > u_m.
\end{cases}
\tag{18}
$$

Note that $f_{N+1/2}^n$, given by (18), is a continuous function of $u_N^n$. We shall
apply the convergence theorem, Theorem 1 given in Sect. 3.1, for the boundary
conditions (16) and (18), and understand the boundary conditions satisfied by
the limit of the approximate solutions. In order to do so, we need to find $g_0$
and $g_1$, satisfying (C1)-(C3), and $\bar u, \bar{\bar u} \in L^\infty(\mathbb{R}_+)$ such that $f_{1/2}^n$ and $f_{N+1/2}^n$,
respectively defined by (16) and (18), satisfy (10). Indeed, it is shown in [7]
that both boundary fluxes $f_{1/2}^n$ and $f_{N+1/2}^n$ may be expressed with the Godunov
flux in the following way:

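— Boundary flux at $x = 1$. One takes $\bar{\bar u} = 1$ a.e. on $\mathbb{R}_+$ and $g_1$ equal to the
Godunov flux, that is $g_1 = g_G$ with

$$
g_G(\alpha,\beta) =
\begin{cases}
\min\{f(s),\ s \in [\alpha,\beta]\} & \text{if } \alpha \le \beta, \\
\max\{f(s),\ s \in [\beta,\alpha]\} & \text{if } \alpha > \beta.
\end{cases}
$$

The formula (18) reads

$$
f_{N+1/2}^n = g_G(u_N^n, 1) =
\begin{cases}
f(u_N^n) & \text{if } u_N^n \le u_m, \\
f(1) & \text{if } u_N^n > u_m.
\end{cases}
\tag{19}
$$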



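— Boundary flux at $x = 0$. One assumes (for simplicity) that $T/k \in \mathbb{N}$. Let
$\alpha, \beta \in (0,1)$ such that $\alpha < \beta$ and $f(\alpha) = f(\beta) = c$. One takes

$$
\bar u(t) =
\begin{cases}
\alpha & \text{if } t < T, \\
0 & \text{if } t > T,
\end{cases}
\tag{20}
$$

so that, recalling that $\bar u^n = \frac{1}{k}\int_{nk}^{(n+1)k} \bar u(t)\,dt$,

$$
f(\bar u^n) =
\begin{cases}
c & \text{if } nk < T, \\
0 & \text{if } nk \ge T.
\end{cases}
$$

Then, if $u_1^n \le \beta$, the formula (16) reads

$$f_{1/2}^n = g_G(\bar u^n, u_1^n), \tag{21}$$

since, in this case, $g_G(\bar u^n, u_1^n) = f(\bar u^n)$. The fact that $u_1^n \le \beta$ is true for all $n$
if $T$ is not too large. If $T$ is too large, the convergence result can be applied
with (21) instead of (16).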
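
The difference between the two treatments of the boundary at $x = 1$ can be checked on a small example. The Python sketch below is an illustration only: the values $a = 1.5$, $b = 1$ are assumptions chosen to satisfy $0 < b < a < 2b$, the closed form $u_m = a/b - 1$ follows from solving $f(s) = f(1)$ for the quadratic flux, the Godunov flux is approximated by sampling, and all function names are hypothetical.

```python
def make_model(a, b):
    """Flux f(s) = a s - b s**2 with 0 < b < a < 2 b, together with u_M and u_m."""
    def f(s):
        return a * s - b * s * s
    u_M = a / (2.0 * b)   # f increases on [0, u_M] and decreases on [u_M, 1]
    u_m = a / b - 1.0     # root of f(s) = f(1) other than s = 1, lies in [0, u_M]
    return f, u_M, u_m


def flux_17(f, u_M, u_last):
    """Rule (17): switch on the sign of f'(u_N); discontinuous at u_M."""
    return f(u_last) if u_last <= u_M else f(1.0)


def flux_18(f, u_m, u_last):
    """Rule (18): keeps the liquid flux f(1) - f_{N+1/2} nonnegative."""
    return f(u_last) if u_last <= u_m else f(u_m)


def godunov_flux(f, alpha, beta, samples=2001):
    """Godunov flux; the min / max over the interval is approximated by sampling."""
    lo, hi = min(alpha, beta), max(alpha, beta)
    values = [f(lo + (hi - lo) * i / (samples - 1)) for i in range(samples)]
    return min(values) if alpha <= beta else max(values)


f, u_M, u_m = make_model(a=1.5, b=1.0)
grid = [i / 200.0 for i in range(201)]

# Largest difference between rule (18) and the Godunov flux g_G(u, 1) on the grid.
gap_18 = max(abs(flux_18(f, u_m, u) - godunov_flux(f, u, 1.0)) for u in grid)
# Size of the jump of rule (17) across u_M.
jump_17 = flux_17(f, u_M, u_M) - flux_17(f, u_M, u_M + 1e-9)

print(u_M, u_m, gap_18, jump_17)
```

In this example rule (18) coincides with $g_G(\cdot, 1)$ up to rounding, whereas rule (17) jumps by $f(u_M) - f(1)$ at $u_M$.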

It is now possible to apply Theorem 1. Let $L$ be a common Lipschitz
constant for $g$ and $g_G$ (on $[0,1]^2$) and let $\zeta > 0$. If $k \le (1-\zeta)h/L$, the approximate
solution $u_{h,k}$, that is the solution defined by (7)-(9), with the boundary fluxes
(19)-(21) (and $u_0 = 0$, $\bar{\bar u} = 1$ and $\bar u$ given by (20)), takes its values in $[0,1]$
and converges towards the unique solution of (22) in $L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any
$1 \le p < \infty$, as $h \to 0$:



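$$
\left\{
\begin{aligned}
& u \in L^\infty((0,1)\times(0,\infty)), \\
& \int_0^\infty\!\!\int_0^1 \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)\,(f(u)-f(\kappa))\,\varphi_x\big]\,dx\,dt \\
& \quad + M\int_0^\infty (\bar u(t)-\kappa)^\pm \varphi(0,t)\,dt + M\int_0^\infty (1-\kappa)^\pm \varphi(1,t)\,dt \\
& \quad + \int_0^1 (0-\kappa)^\pm \varphi(x,0)\,dx \ge 0, \\
& \forall \kappa\in[0,1],\ \forall\varphi\in C_c^1([0,1]\times[0,\infty),\mathbb{R}_+).
\end{aligned}
\right.
\tag{22}
$$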



If $u$, solution of (22), is regular enough on $[0,1] \times (0,T)$, then it is possible to
prove that $u$ satisfies the boundary conditions, for $0 < t < T$, in the following
sense (see [13] and [7]):

— Boundary condition at $x = 0$ (recall that $\bar u$ is given by (20)): $u(0,t) = \alpha$ or
$u(0,t) \ge \beta$. In fact, if $T$ is not too large, one has $u(0,t) = \alpha$.

— Boundary condition at $x = 1$: $u(1,t) \le u_m$ or $u(1,t) = 1$.



Thanks to Theorem 1, it is possible to give other choices for $f_{N+1/2}^n$ for which
the approximate solutions obtained with this new choice of $f_{N+1/2}^n$ converge
towards the same function, which is the unique solution of (22). Indeed, let
$h : [0,1] \to \mathbb{R}$ be a nondecreasing function such that $h \le f$ and $h(1) = f(1)$
and take:

$$f_{N+1/2}^n = h(u_N^n). \tag{23}$$

One may construct a function $g_1$ satisfying (C1)-(C3) such that $h(s) =
g_1(s,1)$, for all $s \in [0,1]$, and then use Theorem 1. Let $L$ be a common Lipschitz
constant for $g$, $g_G$ and $g_1$ (on $[0,1]^2$) and let $\zeta > 0$. If $k \le (1-\zeta)h/L$, the
approximate solution $u_{h,k}$, that is the solution defined by (7)-(9), with the
boundary fluxes (23) and (21) (and $u_0 = 0$, $\bar{\bar u} = 1$ and $\bar u$ given by (20)),
takes its values in $[0,1]$ and converges towards the unique solution of (22) in
$L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any $1 \le p < \infty$, as $h \to 0$.

Turning back to the complete system described in Sect. 2, the analysis of
this simplified model for two phase flows in pipelines may also suggest another
way to take into account the boundary condition at $x = 1$ (with a given
numerical flux $g_1$):

1. Compute $DF(w_N^n)$, its eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ and a basis of $\mathbb{R}^3$,
$\{\varphi_1, \varphi_2, \varphi_3\}$, such that $DF(w_N^n)\varphi_i = \lambda_i\varphi_i$, $i = 1,2,3$.

2. Write $w_N^n$ on the basis $\{\varphi_1, \varphi_2, \varphi_3\}$, namely $w_N^n = \alpha_1\varphi_1 + \alpha_2\varphi_2 + \alpha_3\varphi_3$.
3. Since A3 < 0 and since one wants Qi > 0, compute +

+ /?3V^3 and with the following 3 conditions 

on the components of usual condition on the pressure, f3s = as and 

R%^i = 1 where -R^y+i fraction computed with 



4 Two phase flow in a porous medium 



A second example is given by the modelling of a two phase flow, oil and
water (for instance), in a porous medium. Phases are immiscible. Compressibility
and capillarity effects are neglected. The model is obtained using the
conservation of mass for each phase and Darcy’s law. This study is limited to
the one dimensional case. In this case the pressure can be eliminated and the
problem is reduced to a single equation, namely (5) with:

$$f(u) = \frac{f_1(u)\,(\alpha + \beta f_2(u))}{f_1(u) + f_2(u)}. \tag{24}$$

The unknown is the saturation of one phase, say water, and is denoted
by $u$. The quantity $\alpha$ is the total flux, which is constant in space, thanks
to the incompressibility of the phases. One assumes also that it is constant
in time and positive. The quantity $\beta$ is the difference between the densities
of the phases. The functions $f_1$ and $f_2$ are the mobilities of the phases. The
function $f_1$ is nondecreasing, regular and satisfies $f_1(0) = 0$. The function $f_2$ is
nonincreasing, regular and satisfies $f_2(1) = 0$. The function $f_1 + f_2$ is bounded
from below by a positive number.



Remark 5. For the equivalent two or three dimensional model, the pressure
cannot be eliminated and the resulting model is a coupled system of two partial
differential equations and two unknowns (pressure and saturation). The
problem solved by the limit of the approximate solutions is then
much more complicated to determine. See [6] for a partial study of this question.



Here again, an initial condition is prescribed, namely (6), with $u_0 \in
L^\infty((0,1))$, $0 \le u_0 \le 1$ a.e. The boundary condition will be given later.

The numerical scheme is as in Sect. 3.1; it is given by (7) and (8) with (9).
The choice of the numerical flux, $g$, satisfying (C1)-(C3), is usually given, for
this model, using an “upwinding phase by phase”, that is (see [2], for instance):



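$$
\begin{aligned}
g(a,b) &= \frac{f_1(a)\,(\alpha + \beta f_2(a))}{f_1(a) + f_2(a)} && \text{if } -\alpha + \beta f_1(a) \le 0, \\
g(a,b) &= \frac{f_1(a)\,(\alpha + \beta f_2(b))}{f_1(a) + f_2(b)} && \text{if } -\alpha + \beta f_1(a) > 0.
\end{aligned}
\tag{25}
$$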
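
The phase by phase upwinding can be illustrated with a short Python sketch. It is a hedged example, not the author's code: it assumes that the flux switches branch on the sign of $-\alpha + \beta f_1(a)$ (the sign of the oil velocity), and the quadratic mobilities $f_1(u) = u^2$, $f_2(u) = (1-u)^2$ as well as the values $\alpha = 1$, $\beta = 3$ are assumptions chosen only to satisfy the hypotheses stated above. The script checks consistency, $g(s,s) = f(s)$, and monotonicity on a grid.

```python
ALPHA = 1.0   # total flux (assumed value)
BETA = 3.0    # density difference term (assumed value), so that BETA * f1(1) > ALPHA


def f1(u):
    """Water mobility (assumed quadratic law): nondecreasing, f1(0) = 0."""
    return u * u


def f2(u):
    """Oil mobility (assumed quadratic law): nonincreasing, f2(1) = 0."""
    return (1.0 - u) ** 2


def f(u):
    """Water flux f(u) = f1 (alpha + beta f2) / (f1 + f2)."""
    return f1(u) * (ALPHA + BETA * f2(u)) / (f1(u) + f2(u))


def g(a, b):
    """Phase by phase upwind flux: each mobility is taken upstream of its own phase."""
    if -ALPHA + BETA * f1(a) <= 0.0:
        oil = f2(a)   # oil moves to the right: the upstream state is a
    else:
        oil = f2(b)   # oil moves to the left: the upstream state is b
    return f1(a) * (ALPHA + BETA * oil) / (f1(a) + oil)


grid = [i / 50.0 for i in range(51)]
consistency_gap = max(abs(g(s, s) - f(s)) for s in grid)
monotone_in_a = all(
    g(grid[i + 1], b) >= g(grid[i], b) - 1e-12 for b in grid for i in range(50)
)
monotone_in_b = all(
    g(a, grid[i + 1]) <= g(a, grid[i]) + 1e-12 for a in grid for i in range(50)
)

print(consistency_gap, monotone_in_a, monotone_in_b, f(1.0))
```

With these choices the flux is nondecreasing in its first argument and nonincreasing in its second one, and $f(1) = \alpha$, which corresponds to the injection of pure water.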

Let us then define $f_{1/2}^n$ and $f_{N+1/2}^n$. One considers here the case of an injection
of pure water at $x = 0$. Then:

$$f_{1/2}^n = \alpha, \quad n \ge 0. \tag{26}$$

At $x = 1$, the boundary condition is quite complicated. A simple example
is (see [7] for a more complete study):

$$f_{N+1/2}^n = \frac{f_1(u_N^n)\,\alpha}{f_1(u_N^n) + f_2(u_N^n)}. \tag{27}$$

Then the approximate solution is given with (7)-(9), $g$ given by (25), and
(26)-(27).

In order to prove that the approximate solutions converge, as $h$ and $k$ go
to zero, and to determine the problem of which the limit of the approximate
solutions is the unique solution, one proceeds as in Sect. 3.3. One has to find
$g_0$ and $g_1$ satisfying (C1)-(C3) and $\bar u, \bar{\bar u} \in L^\infty(\mathbb{R}_+)$ such that $f_{1/2}^n$ and $f_{N+1/2}^n$,
respectively defined by (26) and (27), satisfy (10). This is again performed in
[7]. The most interesting case is obtained for $\beta f_1(1) > \alpha$ and when the function
$f$ is increasing on $(0,u_M)$ and decreasing on $(u_M,1)$, as in Sect. 3.3. In fact, the
main point is the existence of a unique $u_m \in (0,1)$ such that $f(u_m) = f(1) = \alpha$
and that $f$ is increasing on $[0,u_m]$ and greater or equal to $\alpha$ on $[u_m,1]$. Then
it is quite easy to prove that (26) yields

$$f_{1/2}^n = \alpha = g_G(u_m, u_1^n),$$

where $g_G$ is the Godunov flux given in Sect. 3.3.

For the boundary condition at $x = 1$, it is possible to construct (see [7]) a
function $g_1 : [0,1]^2 \to \mathbb{R}$ satisfying (C1)-(C3) such that (27) gives:

$$f_{N+1/2}^n = g_1(u_N^n, 1).$$

It is now possible to use Theorem 1.

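Let $L$ be a common Lipschitz constant for $g$ (given by (25)), $g_G$ and $g_1$ (on
$[0,1]^2$) and let $\zeta > 0$. If $k \le (1-\zeta)h/L$, the approximate solution $u_{h,k}$, that is the
solution defined by (7)-(9) (with $g$ given by (25)), and by the boundary fluxes
(26)-(27), takes its values in $[0,1]$ and converges towards the unique solution
of (28) in $L^p_{loc}([0,1] \times \mathbb{R}_+)$ for any $1 \le p < \infty$, as $h \to 0$:

$$
\left\{
\begin{aligned}
& u \in L^\infty((0,1)\times(0,\infty)), \\
& \int_0^\infty\!\!\int_0^1 \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)\,(f(u)-f(\kappa))\,\varphi_x\big]\,dx\,dt \\
& \quad + M\int_0^\infty (u_m-\kappa)^\pm \varphi(0,t)\,dt + M\int_0^\infty (1-\kappa)^\pm \varphi(1,t)\,dt \\
& \quad + \int_0^1 (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0, \\
& \forall \kappa\in[0,1],\ \forall\varphi\in C_c^1([0,1]\times[0,\infty),\mathbb{R}_+),
\end{aligned}
\right.
\tag{28}
$$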
where $M$ is a bound for $|f'|$ on $[0,1]$ ($f$ is given by (24)). As in Sect. 3.3 it
is possible to give the sense of the boundary condition if $u$ is regular enough.
Indeed, let $u$ be a regular solution of (28). Then $u$ satisfies the boundary
conditions in the sense given by [1], that is:

$$\mathrm{sign}(u(0,t) - u_m)\,(f(u(0,t)) - f(\kappa)) \le 0, \quad \forall\kappa \in [u_m, u(0,t)], \text{ for a.e. } t \in \mathbb{R}_+,$$

$$\mathrm{sign}(u(1,t) - 1)\,(f(u(1,t)) - f(\kappa)) \ge 0, \quad \forall\kappa \in [1, u(1,t)], \text{ for a.e. } t \in \mathbb{R}_+,$$

with $[a,b] = \{ta + (1-t)b,\ t \in [0,1]\}$ and $\mathrm{sign}(s) = 1$ for $s > 0$, $\mathrm{sign}(s) = -1$
for $s < 0$, $\mathrm{sign}(0) = 0$.

This gives $u(0,t) = u_m$ or $u(0,t) = 1$ and $u(1,t) \le u_m$ or $u(1,t) = 1$.
In particular, at $x = 0$, one has $f(u(0,t)) = \alpha$ (only water is injected) and,
at $x = 1$, $f(u(1,t)) < \alpha$ if $u(1,t) < u_m$ (which states that there is some oil
production).



5 The multidimensional scalar case 

In this section, a generalization of Theorem 1 is presented for the multidimensional
scalar case together with a rough sketch of proof. For the sake of
simplicity, one considers $d = 2$ (the extension to $d = 3$ is straightforward) and
a flux function of the form $v(x,t)f(u)$, with $\mathrm{div}(v(\cdot,t)) = 0$ (see [13] for the
general case of a flux function $f(x,t,u)$). This leads to the following equation:

$$u_t + \mathrm{div}(v f(u)) = 0, \quad \text{in } \Omega \times (0,T), \tag{29}$$

where $\Omega$ is a bounded polygonal open set of $\mathbb{R}^2$, $T > 0$, $f \in C^1(\mathbb{R},\mathbb{R})$ (or
$f : \mathbb{R} \to \mathbb{R}$ Lipschitz continuous) and $v \in C^1(\mathbb{R}^2 \times [0,T], \mathbb{R}^2)$ with
$\mathrm{div}(v(\cdot,t)) = 0$ in $\mathbb{R}^2$ for all $t \in [0,T]$. The unknown is $u : \Omega \times (0,T) \to \mathbb{R}$.

Let $u_0 \in L^\infty(\Omega)$ and $\bar u \in L^\infty(\partial\Omega \times (0,T))$. Let $A, B \in \mathbb{R}$ be such that
$A \le u_0 \le B$ a.e. on $\Omega$ and $A \le \bar u \le B$ a.e. on $\partial\Omega \times (0,T)$. Following the work
of [10], an entropy weak solution of (29) with the initial condition $u_0$ and the
(weak) boundary condition $\bar u$ is a solution of (30):



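$$
\left\{
\begin{aligned}
& u \in L^\infty(\Omega \times (0,T)), \\
& \int_0^T\!\!\int_\Omega \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)\,(f(u)-f(\kappa))\,v \cdot \mathrm{grad}\,\varphi\big]\,dx\,dt \\
& \quad + M\int_0^T\!\!\int_{\partial\Omega} (\bar u(t)-\kappa)^\pm \varphi(x,t)\,d\gamma(x)\,dt \\
& \quad + \int_\Omega (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0, \\
& \forall \kappa\in[A,B],\ \forall\varphi\in C_c^1(\bar\Omega\times[0,T),\mathbb{R}_+),
\end{aligned}
\right.
\tag{30}
$$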



where $d\gamma(x)$ stands for the integration with respect to the one dimensional
Lebesgue measure on the boundary of $\Omega$ and $M$ is such that

$$\|v\|_\infty\,|f(s_1) - f(s_2)| \le M|s_1 - s_2| \quad \text{for all } s_1, s_2 \in [A,B],$$

where $\|v\|_\infty = \sup_{(x,t) \in \Omega \times [0,T]} |v(x,t)|$ ($|\cdot|$ denotes here the Euclidean
norm in $\mathbb{R}^2$).

Remark 6.

1. If $u$ satisfies the family of inequalities (30), it is possible to prove that $u$ is
a solution of (29) (in a weak form) and that $u$ satisfies some entropy inequalities in
$\Omega \times (0,T)$, namely $|u-\kappa|_t + \mathrm{div}(v(f(\max(u,\kappa)) - f(\min(u,\kappa)))) \le 0$ for all
$\kappa \in \mathbb{R}$, but also on the boundary of $\Omega$ and on $t = 0$. The function $u$ satisfies the initial
condition ($u(\cdot,0) = u_0$) and $u$ satisfies partially the boundary condition.
For instance, if $f' > 0$ and $u$ is regular enough, then $u(x,t) = \bar u(x,t)$ if
$x \in \partial\Omega$, $t \in (0,T)$ and $v(x,t) \cdot n(x,t) < 0$, where $n$ is the outward normal
vector to $\partial\Omega$.

2. Let $M > 1$. It is interesting to remark that $u$ is solution of (30) if and only
if $u$ is solution of (30) where the term $\int_\Omega (u_0-\kappa)^\pm \varphi(x,0)\,dx$ is replaced by
$M\int_\Omega (u_0-\kappa)^\pm \varphi(x,0)\,dx$.

A sketch of proof of existence and uniqueness of the solution of (30) together 
with the convergence of numerical approximations is now given, following [13]. 

Step 1: Approximate solution. With a quite general mesh of $\Omega$ (with
triangles, for instance), denoted by $\mathcal{T}$, and a time step $k$, it is possible to
define an approximate solution, denoted by $u_{\mathcal{T},k}$, using some numerical fluxes
(on the edges of the mesh) satisfying conditions similar to (C1)-(C3) in Sect.
3.1. Under a so called CFL condition (like $k \le (1-\zeta)h/L$ in Sect. 3.1), it is
easy to prove that $A \le u_{\mathcal{T},k} \le B$ a.e. on $\Omega \times (0,T)$. Unfortunately, it does
not seem easy to obtain directly a strong compactness result on the family of
approximate solutions (although this strong compactness result is true, as we
shall see below).

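Step 2: Weak compactness. Using only this bound on $u_{\mathcal{T},k}$, one can
assume (for convenient subsequences of sequences of approximate solutions)
that $u_{\mathcal{T},k} \to u$, as the mesh size goes to zero (with the CFL condition), in
a “nonlinear weak-$\star$ sense” (similar to the convergence towards Young measures,
see [4] for instance), that is $u \in L^\infty(\Omega \times (0,T) \times (0,1))$ and

$$\int_0^T\!\!\int_\Omega g(u_{\mathcal{T},k}(x,t))\,\varphi(x,t)\,dx\,dt \to \int_0^1\!\!\int_0^T\!\!\int_\Omega g(u(x,t,\alpha))\,\varphi(x,t)\,dx\,dt\,d\alpha,$$

for all $\varphi \in L^1(\Omega \times (0,T))$ and all $g \in C(\mathbb{R},\mathbb{R})$.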



Step 3: Passing to the limit. Using the monotonicity of the numerical 
fluxes, the approximate solutions satisfy some discrete entropy inequalities. 
Passing to the limit in these inequalities gives that u (defined in Step 2) is 
solution of some inequalities which are very similar to (30), namely: 
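$$
\left\{
\begin{aligned}
& u \in L^\infty(\Omega \times (0,T) \times (0,1)), \\
& \int_0^1\!\!\int_0^T\!\!\int_\Omega \big[(u-\kappa)^\pm \varphi_t + \mathrm{sign}_\pm(u-\kappa)\,(f(u)-f(\kappa))\,v \cdot \mathrm{grad}\,\varphi\big]\,dx\,dt\,d\alpha \\
& \quad + M\int_0^T\!\!\int_{\partial\Omega} (\bar u(t)-\kappa)^\pm \varphi(x,t)\,d\gamma(x)\,dt \\
& \quad + \int_\Omega (u_0-\kappa)^\pm \varphi(x,0)\,dx \ge 0, \\
& \forall \kappa\in[A,B],\ \forall\varphi\in C_c^1(\bar\Omega\times[0,T),\mathbb{R}_+).
\end{aligned}
\right.
\tag{31}
$$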



For this step, one chooses $M$ not only greater than the Lipschitz constant
of $\|v\|_\infty f$ on $[A,B]$, but also greater than the Lipschitz constant (on $[A,B]^2$)
of the numerical fluxes associated to the edges of the meshes (the equivalent
of $L$ in Theorem 1). This choice of $M$ is possible since the unique solution
of (30) does not depend on $M$ provided that $M$ is greater than the Lipschitz
constant of $\|v\|_\infty f$ on $[A,B]$ and since it is possible to choose numerical fluxes
(the Godunov flux, for instance) such that the Lipschitz constant of these
numerical fluxes is bounded by the Lipschitz constant of $\|v\|_\infty f$ (then, the
present method leads to an existence result with $M$ only greater than the
Lipschitz constant of $\|v\|_\infty f$ on $[A,B]$, passing to the limit on approximate
solutions given with these numerical fluxes).

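Step 4: Uniqueness of the solution of (31). In this step, the “doubling
variables” method of Krushkov is used to prove the uniqueness of the
solution of (31). Indeed, if $u$ and $w$ are two solutions of (31), the doubling
variables method leads to:

$$
\begin{aligned}
& \int_0^1\!\!\int_0^1\!\!\int_0^T\!\!\int_\Omega |u(x,t,\alpha) - w(x,t,\beta)|\,\varphi_t\,dx\,dt\,d\alpha\,d\beta \\
& \quad + \int_0^1\!\!\int_0^1\!\!\int_0^T\!\!\int_\Omega \big(f(\max(u,w)) - f(\min(u,w))\big)\,v \cdot \mathrm{grad}\,\varphi\,dx\,dt\,d\alpha\,d\beta \ge 0, \\
& \forall\varphi\in C_c^1(\bar\Omega\times[0,T),\mathbb{R}_+).
\end{aligned}
\tag{32}
$$

Taking $\varphi(x,t) = (T-t)^+$ in (32) (which is, indeed, possible) gives that $u$
does not depend on $\alpha$, $w$ does not depend on $\beta$ and $u = w$ a.e. on $\Omega \times (0,T)$.
As a result, $u$ is also the unique solution of (30).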

Step 5: Conclusion. Step 4 gives, in particular, the uniqueness of the
solution of (30). It also implies that the nonlinear weak-$\star$ limit of sequences
of approximate solutions is a solution of (30) and, therefore, it guarantees the
existence of the solution of (30). Furthermore, since the nonlinear weak-$\star$ limit
of sequences of approximate solutions does not depend on $\alpha$, it is quite easy
to deduce that this limit is “strong” in $L^p(\Omega \times (0,T))$ for any $p \in [1,\infty)$ (see
[4], for instance) and, thanks to the uniqueness of the limit, the convergence
holds without extraction of subsequences.
Summary. This paper deals with a class of 2D shape optimization problems with
a ‘flux’ cost functional and a fictitious domain formulation of state constraints. These
constraints are given by nonhomogeneous Dirichlet boundary problems in bounded,
doubly connected domains. This approach is used for the numerical realization of
free-boundary problems of Bernoulli type.



1 Introduction 

This paper deals with a particular shape optimization problem with a fictitious
domain (FD) formulation of the state equation. Solvers based on
FD formulations are nowadays among the efficient tools for solving large
scale algebraic systems arising from discretizations of state problems. The
fact that the new FD problem is solved in a domain $\Omega$ with a simple shape
(a box, e.g.) enables us to construct uniform partitions of $\Omega$ and consequently
to use fast solvers and special preconditioning techniques. FD solvers have
additional advantages when used in shape optimization. To see that, let us
recall the standard approach in shape optimization, which is based on boundary
variations of admissible domains. Let us suppose that a linear state problem is
solved by a standard finite element method and a gradient type method is used
for the minimization of the cost functional. Then the following steps have to be
performed after every change of the shape: (i) remeshing the new configuration;
(ii) assembling the new stiffness matrix and the right-hand side of the linear
algebraic system; (iii) solving this new system. As a result the computational
process is not efficient. As we shall see, FD solvers utilizing nonfitted meshes
completely avoid step (i) and partially avoid step (ii) since the stiffness matrix
remains the same for every admissible domain. The FD formulation that we use
in this paper is based on the dualization of Dirichlet conditions by boundary
Lagrange multipliers ([6], [8]). It turns out that this variant is appropriate in
shape optimization problems for the following reasons: the Lagrange multiplier,
being part of the solution, is equal to the conormal derivative on the searched
boundary of the solution to the original state problem. The conormal derivative
of the state appears in expressions for the shape derivative of cost functionals
([18]). In addition, the computed Lagrange multiplier is used when evaluating
our particular cost functional.

The paper is organized as follows. In Section 2 the optimal shape design 
problem with the “flux” cost functional is defined and analyzed. State prob- 
lems are given by Dirichlet boundary value problems in doubly connected, 
bounded domains. To avoid difficulties with shape dependent function spaces 
and for numerical purposes we use the above mentioned FD formulation of 
the state problems. We introduce appropriate assumptions on the family of 
admissible domains which guarantee the existence of optimal shapes. Conti- 
nuity of solutions to the FD formulation with respect to domain variations 
which plays a key role in the existence analysis is the main result of this sec- 
tion. To avoid the evaluation of the dual trace norm which defines the cost 
functional we give an alternative setting of the problem in which the standard 
iLo(0)-norm is used for expressing the cost functional. Section 3 deals with 
computational aspects of this approach. We present a finite element discretiza- 
tion of the FD formulation. Then we shortly describe the modified controlled 
random search (MCRS) method, i.e. the gradient free global minimization 
method of the stochastic type which will be used for minimizing the cost func- 
tional. For more details on computational aspects we refer to [11] and [9]. 
Finally, in Section 4 several Bernoulli free-boundary problems will be solved 
using our approach. Both the gradient and MCRS methods will be used for 
the numerical minimization of the cost functional. 

2 Setting of the problem, existence analysis 

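In what follows we shall consider the following optimal shape design problem:

$$\min_{\omega \in \mathcal{O}} J(\omega), \qquad J(\omega) := \frac{1}{2}\Big\|\frac{\partial u(\omega)}{\partial\nu_A} - Q\Big\|^2_{-1/2,\Gamma_f(\omega)}, \tag{P}$$

where:

$\mathcal{O}$ is a family of admissible domains which consists of doubly connected
domains contained in a box $\Omega$. The components of the boundary $\partial\omega$
are denoted by $\Gamma_0$ and $\Gamma_f(\omega)$. We shall suppose that $\Gamma_0$ is fixed and the
same for all $\omega \in \mathcal{O}$ while $\Gamma_f(\omega)$ is variable and defines the shape of $\omega$ (in our
presentation $\Gamma_f(\omega)$ is exterior to $\Gamma_0$ but one can also consider the opposite
situation);

$\partial u(\omega)/\partial\nu_A$ denotes the conormal derivative of $u$ on $\Gamma_f(\omega)$, where $u$ solves the
Dirichlet state problem in $\omega$:

$$
(\mathcal{P}(\omega)) \quad
\begin{cases}
-\mathrm{div}(A\nabla u(\omega)) = f & \text{in } \omega, \\
u(\omega) = g & \text{on } \Gamma_0, \\
u(\omega) = 0 & \text{on } \Gamma_f(\omega),
\end{cases}
$$

with $f \in L^2(\Omega)$, $g \in H^{1/2}(\Gamma_0)$, $A \in L^\infty(\Omega;\mathbb{R}^{2\times 2})$ satisfying:

$$\exists\alpha > 0 : \quad A\xi \cdot \xi \ge \alpha\|\xi\|^2 \quad \forall\xi \in \mathbb{R}^2, \text{ a.e. in } \Omega.$$

The symbol $\|\cdot\|_{-1/2,\Gamma_f(\omega)}$ stands for the dual norm in $H^{-1/2}(\Gamma_f(\omega))$.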

Remark 1. Problem (P) is closely related to Bernoulli free-boundary problems.
Indeed, let there exist $\omega^* \in \mathcal{O}$ such that

$$J(\omega^*) = \frac{1}{2}\Big\|\frac{\partial u(\omega^*)}{\partial\nu_A} - Q\Big\|^2_{-1/2,\Gamma_f(\omega^*)} = 0.$$

Then simultaneously $u(\omega^*) = 0$ and $\partial u(\omega^*)/\partial\nu_A = Q$ on $\Gamma_f(\omega^*)$, which is a typical
system of boundary conditions satisfied on free boundaries. A shape optimization
approach can be used for the numerical realization of this type of problems.
To illustrate how we proceed let us consider the exterior Bernoulli free-boundary
problem (see [3]): for $Q < 0$ given, find $\omega^* \in \mathcal{O}$ and $u^* : \omega^* \to \mathbb{R}$
such that

$$
\begin{cases}
\Delta u^* = 0 & \text{in } \omega^*, \\
u^* = 1 & \text{on } \Gamma_0, \\
u^* = 0,\ \dfrac{\partial u^*}{\partial\nu} = Q & \text{on } \Gamma_f(\omega^*).
\end{cases}
\tag{1}
$$

For $\omega \in \mathcal{O}$ given a priori, problem (1) is ill-posed due to the conditions on
$\Gamma_f(\omega)$. A possible way to solve (1) is to skip one of the conditions on
$\Gamma_f(\omega)$ (say $\partial u/\partial\nu = Q$). The rest of the problem is now well posed and defines
the state problem $(\mathcal{P}(\omega))$. The remaining Neumann condition on $\Gamma_f(\omega)$ will be
satisfied by minimizing the cost functional $J$. If $\omega_{opt} \in \mathcal{O}$ is a solution to (P)
such that $J(\omega_{opt}) = 0$ then $\omega^* = \omega_{opt}$ solves (1). Let us mention however that
the assumptions on $\mathcal{O}$ which are specified below do not automatically ensure
$J(\omega_{opt}) = 0$.
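
A radially symmetric configuration gives a simple case in which the free boundary of the exterior Bernoulli problem can be computed by hand, and is therefore a convenient sanity check. The Python sketch below is an illustration under assumptions that are not taken from the paper: $\Gamma_0$ is a circle of radius $r_0$, the free boundary is a concentric circle of radius $R$, the operator is the Laplacian, and the values $r_0 = 0.1$, $Q = -10$ are arbitrary. In that setting $u(r) = \ln(R/r)/\ln(R/r_0)$ is harmonic with $u = 1$ on $\Gamma_0$ and $u = 0$ on the outer circle, and the remaining condition $\partial u/\partial\nu = Q$ becomes the scalar equation $R\ln(R/r_0) = -1/Q$, solved here by bisection.

```python
import math


def normal_derivative(r0, radius):
    """Outward normal derivative on |x| = radius of u(r) = ln(radius/r) / ln(radius/r0)."""
    return -1.0 / (radius * math.log(radius / r0))


def free_boundary_radius(r0, q, iterations=200):
    """Bisection for the radius R > r0 with normal_derivative(r0, R) = q, where q < 0."""
    target = -1.0 / q                        # R * ln(R / r0) must equal this positive number
    lo, hi = r0, 2.0 * r0
    while hi * math.log(hi / r0) < target:   # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.log(mid / r0) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r0, q = 0.1, -10.0
radius = free_boundary_radius(r0, q)
print(radius, normal_derivative(r0, radius))
```

For this radius both conditions on the outer circle hold, so the mismatch measured by the cost functional vanishes; for any other radius the Dirichlet problem is still solvable but the normal derivative differs from $Q$.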

Next we shall closely follow [10] where detailed proofs of all results can be 
found. 



2.1 Parametrization of shapes 

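We shall suppose that the outer components $\Gamma_f(\omega)$ of the boundary $\partial\omega$, $\omega \in \mathcal{O}$,
are parametrized by $2\pi$-periodic functions $\gamma : [0,2\pi] \to \mathbb{R}^2$. We shall use the
following notations: $\Gamma_f(\omega) := \Gamma_\gamma$, meaning that $\Gamma_f(\omega)$ is the range of $\gamma$. Further
$\omega_\gamma$ denotes the doubly connected domain between $\Gamma_0$ and $\Gamma_\gamma$. The family of
admissible domains can be specified by a class $\mathcal{S}$ to which $\gamma$ belongs.

Definition 1. A function $\gamma : [0,2\pi] \to \mathbb{R}^2$ belongs to $\mathcal{S}$ if:

($S_1$) $\gamma \in C^2_{2\pi}$, i.e. $\gamma$ is $2\pi$-periodic, twice continuously differentiable on $[0,2\pi]$;

($S_2$) $\exists a, \gamma_1, \gamma_2 > 0\ \forall t \in [0,2\pi]$: $|\gamma'(t)| \ge a$, $\|\gamma'\|_\infty \le \gamma_1$, $\|\gamma''\|_\infty \le \gamma_2$;

($S_3$) $\gamma$ is positively oriented;

($S_4$) $\Omega = (0,1)^2$;

($S_5$) $\exists d > 0$: $\mathrm{dist}(\Gamma_0, \Gamma_\gamma) \ge d$, $\mathrm{dist}(\Gamma_\gamma, \partial\Omega) \ge d$;

($S_6$) $\exists h > 0\ \forall\gamma \in \mathcal{S}\ \forall t \in [0,2\pi]\ \exists B_i, B_o$ open discs of radius $h$ such that
$B_i \subset \omega_\gamma$, $B_o \subset \Omega \setminus \bar\omega_\gamma$, $\gamma(t) \in \bar B_i \cap \bar B_o$.

Remark 2. Assumption ($S_6$) means that there exists a tubular neighborhood
of $\Gamma_\gamma$ of uniform thickness for all $\gamma \in \mathcal{S}$ (see Fig. 1).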




Fig. 1. Geometry of the Bernoulli problem 



2.2 The fictitious domain formulation of 

The unit square $\Omega$, which contains $\omega_\gamma$ for all $\gamma \in \mathcal{S}$, will serve as the fictitious
domain in the FD formulation. Let us introduce the following spaces:

$$V_g = \{v \in H_0^1(\Omega) \mid v = g \text{ on } \Gamma_0,\ v = 0 \text{ on } \Gamma_\gamma\}, \qquad V_0 := V_g \text{ with } g = 0.$$

Instead of $(\mathcal{P}(\omega_\gamma))$, $\gamma \in \mathcal{S}$ fixed, we formulate the following problem on $\Omega$:

$$\text{Find } u \in V_g : \quad a(u,v) = (\tilde f, v)_{0,\Omega} \quad \forall v \in V_0, \tag{2}$$

where

$$a(u,v) := \int_\Omega A\nabla u \cdot \nabla v\,dx$$

and $\tilde f$ is the zero extension of $f$ from $\omega_\gamma$ into $\Omega$.

Remark 3. It would be possible to use any continuous extension of $f$ outside
of $\omega_\gamma$ but the zero extension is of a special significance.

It is easy to see that (2) has a unique solution $u$. In addition, $u|_{\omega_\gamma}$ solves
$(\mathcal{P}(\omega_\gamma))$.

Now we may look at the conditions on Pq and P^ as constraints which 
will be released by means of the Lagrange multipliers belonging to iP“^/^(Po), 
^-i/ 2 (p^)^ respectively. The FD formulation of {V{^'y)) reads as follows: 



(^K)) 



' Find {u,Xo,Xy) G H^{n) X x F-i/2(r^) g.t. 

a{u,v) - {Xo,Tov)ro ~ = {f,v)o,o. Vv e 

(/io,Tou)ro + {di,ryu)r^ = {no,g)ro 

v(mo, M 7) e X 



Here tq : Hq{Q) — > : Hq{Q) — » stand for the trace 

mappings and (, )ro? (? )r^ denote the respective duality pairings. Using re- 
sults of [2] one can easily prove 



Theorem 1. There exists a unique solution (li, Ao,A^) to In addi- 
tion, u := solves (V{uj^)) and q^i 



Remark 4- The equality A^ — on P^ is due to the zero extension of / 
outside of cuy. For any other extension one only knows that A^ = 
where [ ]r^ denotes the jump of the corresponding quantity across P^ 






The previous result motivates us to consider (P) in the following form: 
mm ipT, - (P) 

where A^ G is the last component of the solution to and 

Sl C S is 8i compact subset of «S. 



2.3 Periodic function spaces on [0, 27 t] 

To prove the existence of optimal shapes in (P) one has to show that the 
solutions to {V{(jJy)) depend continuously on variations of 7 G S. One of the 
difficulties that we face in any shape optimization problem is the fact that the 
functions have their own, variable domain of definition. To handle this difficulty 
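we pass to reference, shape independent function spaces. This will be done for
the trace space $H^{1/2}(\Gamma_\gamma)$ and its dual $H^{-1/2}(\Gamma_\gamma)$. Using the parametrization
of $\Gamma_\gamma$ we shall replace $H^{1/2}(\Gamma_\gamma)$ by the reference space $H^{1/2}_{2\pi}$ and
$H^{-1/2}(\Gamma_\gamma)$ by $H^{-1/2}_{2\pi}$.

The space of $2\pi$-periodic, square integrable functions will be denoted by
$L^2_{2\pi}$. The reference trace space is defined as follows (see [13]):

$$H^{1/2}_{2\pi} = \{\varphi \in L^2_{2\pi} \mid \|\varphi\|_{1/2,2\pi} < \infty\},$$

where

$$\|\varphi\|^2_{1/2,2\pi} := \|\varphi\|^2_{L^2_{2\pi}} + \int_0^{2\pi}\!\!\int_0^{2\pi} \frac{|\varphi(t) - \varphi(s)|^2}{|\sin((t-s)/2)|^2}\,dt\,ds.$$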
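
The reference norm, that is the squared $L^2$ norm over one period plus the double integral of $|\varphi(t) - \varphi(s)|^2 / |\sin((t-s)/2)|^2$, can be evaluated numerically for smooth periodic functions. The Python sketch below is an illustration, not part of the method of the paper: the quadrature (midpoints in $t$, left end points in $s$, so that the kernel is never evaluated at $t = s$), the number of nodes and the function name are assumptions. For $\varphi = \sin$ the integrand of the double integral reduces to $4\cos^2((t+s)/2)$, which gives the closed form $\pi + 8\pi^2$ used as a check.

```python
import math


def half_norm_squared(phi, n=64):
    """Approximate the squared reference H^{1/2} norm of a smooth 2*pi-periodic phi."""
    h = 2.0 * math.pi / n
    t_nodes = [(i + 0.5) * h for i in range(n)]   # midpoints
    s_nodes = [j * h for j in range(n)]           # left end points, so t != s always
    l2_part = h * sum(phi(t) ** 2 for t in t_nodes)
    double_part = 0.0
    for t in t_nodes:
        for s in s_nodes:
            double_part += (phi(t) - phi(s)) ** 2 / math.sin(0.5 * (t - s)) ** 2
    return l2_part + h * h * double_part


value = half_norm_squared(math.sin)
exact = math.pi + 8.0 * math.pi ** 2
print(value, exact)
```

For trigonometric polynomials such as this one the equispaced rule is exact up to rounding; for less regular functions the quadrature would need more care near the diagonal $t = s$.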



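Next we shall define the trace mapping $T_\gamma$ from $H_0^1(\Omega)$ into the reference space
$H^{1/2}_{2\pi}$. To this end we introduce the following spaces on $\Gamma_\gamma$, $\gamma \in \mathcal{S}$:

$$H^{1/2}_p(\Gamma_\gamma) = \{\varphi \in L^2(\Gamma_\gamma) \mid \varphi \circ \gamma \in H^{1/2}_{2\pi}\}$$

with the norm

$$\|\varphi\|_{1/2,p} = \|\varphi \circ \gamma\|_{1/2,2\pi}$$

and the standard Sobolev space $H^{1/2}(\Gamma_\gamma)$ endowed with the norm ([15]):

$$
\begin{aligned}
\|\varphi\|^2_{1/2,\gamma} &:= \|\varphi\|^2_{L^2(\Gamma_\gamma)} + \int_{\Gamma_\gamma}\!\int_{\Gamma_\gamma} \frac{|\varphi(x) - \varphi(y)|^2}{|x-y|^2}\,ds_x\,ds_y \\
&= \int_0^{2\pi} |\varphi \circ \gamma(t)|^2\,|\gamma'(t)|\,dt + \int_0^{2\pi}\!\!\int_0^{2\pi} \frac{|\varphi \circ \gamma(t) - \varphi \circ \gamma(s)|^2}{|\gamma(t) - \gamma(s)|^2}\,|\gamma'(t)||\gamma'(s)|\,dt\,ds,
\end{aligned}
$$

making use of the parametrization of $\Gamma_\gamma$. The relation between $H^{1/2}(\Gamma_\gamma)$ and
$H^{1/2}_p(\Gamma_\gamma)$ follows from the next lemma.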

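Lemma 1. The spaces $H^{1/2}(\Gamma_\gamma)$ and $H^{1/2}_p(\Gamma_\gamma)$ coincide as sets and are
topologically equivalent, uniformly with respect to $\gamma \in \mathcal{S}$, i.e. there exist positive
constants $c_1$, $c_2$ such that

$$c_1\|\varphi\|_{1/2,\gamma} \le \|\varphi \circ \gamma\|_{1/2,2\pi} \le c_2\|\varphi\|_{1/2,\gamma}$$

holds for every $\varphi \in H^{1/2}(\Gamma_\gamma)$ and every $\gamma \in \mathcal{S}$.

Let $i_\gamma : H^{1/2}(\Gamma_\gamma) \to H^{1/2}_p(\Gamma_\gamma)$ be the identity mapping and $I_\gamma :
H^{1/2}_p(\Gamma_\gamma) \to H^{1/2}_{2\pi}$ be an isometry defined by $I_\gamma(\varphi) := \varphi \circ \gamma$. Then the trace
mapping $T_\gamma : H_0^1(\Omega) \to H^{1/2}_{2\pi}$ is given by

$$T_\gamma := I_\gamma \circ i_\gamma \circ \tau_\gamma.$$

The trace mapping $T_\gamma$ enjoys properties which are useful in the existence
analysis for (P).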
Lemma 2. 1) There exists a constant $c > 0$ such that
2) If In -^1 in C\^, 7„,7 € S, then 

in 

holds for every v G Hq (Q) . 



(3) 

(4) 



The mapping Ty maps Hq{Q.) onto B.y^ • This follows from 

Lemma 3. {Inverse trace property) There exists a continuous extension map- 
ping Hq{Q) such that 

= ip Vy> 6 

and 

where c > 0 does not depend on j G S. In addition, supp^^(/? is contained in 
the h-neighhorhood ofV^. 

For the proof of the previous lemmas we refer to [10]. 



In a similar way one can associate with any G H a unique 

element G 

(M7- = </“7> Iyh<p)27r = (/i-y, V? O 7 ) 2 ^ e Fi/2(r^), 



where jl^ := *ii^ and ( , )27 t denotes the duality pairing between 

and Hli\ 

This enables us to rewrite problem $(\mathcal{P}(\omega_\gamma))$ in the following equivalent
form in which only shape independent function spaces appear:

' Find (u,Xo,X.y) € H^{a) X F-i/2(ro) x s.t. 

a{u,v) - {Xo,Tov)ro - {X^,T^v)2t^ = (/,v)o,a Vv e 

< 



(^K)).e/ 



iho,Tou)ro + {h,Tyu)2Tv = (^o,5)ro 

V(/xo,/i) € X 



{Q,‘P)r^ 




poy |7'| dt = {Q\'l'\,p O 7)2^, 



Since 
we obtain the following transformation of (P):



7G<Sl 



in 



h: 



' ref 



where G is i^st component of the solution to (P{uj^))^ 



ef 



2.4 Existence analysis for 

As we have already mentioned, the continuous dependence of solutions to
$(\mathcal{P}(\omega_\gamma))_{\rm ref}$ with respect to $\gamma \in \mathcal{S}$ plays a central role in the existence analysis.
To prove this property we shall need the following auxiliary results.

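Lemma 4. The set of solutions $\{(u_\gamma, \lambda_{0\gamma}, \lambda_\gamma),\ \gamma \in \mathcal{S}\}$ to $(\mathcal{P}(\omega_\gamma))_{\rm ref}$ is
bounded in $H_0^1(\hat\Omega) \times H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}$.

Lemma 5. (Stability of the Dirichlet boundary condition) Let $\gamma_n \to \gamma$ in $C_{2\pi}$,
$\gamma_n, \gamma \in \mathcal{S}$ and let $\{v_n\}$ be any sequence in $H_0^1(\hat\Omega)$ such that $\tau_0 v_n = g$, $\tau_{\gamma_n} v_n = 0$.

If $v_n \rightharpoonup v$ in $H_0^1(\hat\Omega)$, then $\tau_0 v = g$ and $\tau_\gamma v = 0$.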

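Continuity of solutions to $(\mathcal{P}(\omega_\gamma))_{\rm ref}$ with respect to $\gamma \in \mathcal{S}$ now follows
from the next theorem.

Theorem 2. Let $\gamma_n \to \gamma$ in $C_{2\pi}$, $\gamma_n, \gamma \in \mathcal{S}$, and let $(u_n, \lambda_{0n}, \lambda_n) \in H_0^1(\hat\Omega) \times
H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}$ be the solution to $(\mathcal{P}(\omega_{\gamma_n}))_{\rm ref}$. Then

\[ \begin{cases} u_n \to u & \text{in } H_0^1(\hat\Omega), \\ \lambda_{0n} \to \lambda_0 & \text{in } H^{-1/2}(\Gamma_0), \\ \lambda_n \rightharpoonup \lambda_\gamma & \text{in } H^{-1/2}_{2\pi} \end{cases} \tag{6} \]

and $(u, \lambda_0, \lambda_\gamma)$ is a solution to $(\mathcal{P}(\omega_\gamma))_{\rm ref}$. In addition, if $\gamma_n \to \gamma$ in $C^1_{2\pi}$ then

\[ \lambda_n \to \lambda_\gamma \quad \text{in } H^{-1/2}_{2\pi}. \tag{7} \]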

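Proof. By Lemma 4 we may assume that there exists a subsequence of
$\{(u_n, \lambda_{0n}, \lambda_n)\}$ (denoted by the same symbol) such that

\[ (u_n, \lambda_{0n}, \lambda_n) \rightharpoonup (u, \lambda_0, \lambda_\gamma) \quad \text{in } H_0^1(\hat\Omega) \times H^{-1/2}(\Gamma_0) \times H^{-1/2}_{2\pi}. \tag{8} \]

From Lemma 5 it follows that $\tau_0 u = g$, $\tau_\gamma u = 0$ so that the second equation
in $(\mathcal{P}(\omega_\gamma))_{\rm ref}$ is satisfied. Let $v \in H_0^1(\hat\Omega)$ be fixed. Then

\[ \lim_{n\to\infty} \langle \lambda_n, T_{\gamma_n} v\rangle_{2\pi} = \lim_{n\to\infty} \{a(u_n, v) - \langle\lambda_{0n}, \tau_0 v\rangle_{\Gamma_0} - (f,v)_{0,\hat\Omega}\} = a(u,v) - \langle\lambda_0, \tau_0 v\rangle_{\Gamma_0} - (f,v)_{0,\hat\Omega} \tag{9} \]

as follows from (8). On the other hand

\[ \lim_{n\to\infty} \langle \lambda_n, T_{\gamma_n} v\rangle_{2\pi} = \lim_{n\to\infty} \big(\langle \lambda_n, T_{\gamma_n} v - T_\gamma v\rangle_{2\pi} + \langle \lambda_n, T_\gamma v\rangle_{2\pi}\big) = \langle \lambda_\gamma, T_\gamma v\rangle_{2\pi}, \]

making use of (4) and (8). From this and (9) we see that $(u, \lambda_0, \lambda_\gamma)$ solves
$(\mathcal{P}(\omega_\gamma))_{\rm ref}$. In addition, the whole sequence in (8) tends to $(u, \lambda_0, \lambda_\gamma)$. Strong
convergence of $\{u_n\}$ follows from

\[ a(u_n, u_n) = \langle\lambda_{0n}, g\rangle_{\Gamma_0} + (f, u_n)_{0,\hat\Omega} \to \langle\lambda_0, g\rangle_{\Gamma_0} + (f, u)_{0,\hat\Omega} = a(u, u) \]

using that $\tau_0 u_n = g$ and $T_{\gamma_n} u_n = 0$.

Let $\varphi \in H^{1/2}(\Gamma_0)$ be given and $v \in H_0^1(\hat\Omega)$ be its continuous extension:
$\tau_0 v = \varphi$, $\|v\|_{1,\hat\Omega} \le c\|\varphi\|_{1/2,\Gamma_0}$, supp $v$ being in a $\delta$-neighborhood of $\Gamma_0$,
$\delta < d$. Then

\[ |\langle\lambda_{0n} - \lambda_0, \varphi\rangle_{\Gamma_0}| \le \|a\|\, \|u_n - u\|_{1,\hat\Omega} \|v\|_{1,\hat\Omega} \le c\, \|u_n - u\|_{1,\hat\Omega} \|\varphi\|_{1/2,\Gamma_0} \]

yielding strong convergence of $\{\lambda_{0n}\}$ to $\lambda_0$. For the proof of (7) which is more
technical we refer to [10]. □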



It remains to specify a compact subset of $\mathcal{S}$. Let $L > 0$ be given and
define

\[ \mathcal{S}_L = \{\gamma \in \mathcal{S} \mid |\gamma'(t) - \gamma'(s)| \le L|t - s| \quad \forall t, s \in [0, 2\pi]\}. \tag{10} \]

Theorem 3. Let be defined by (10). Then has a solution. 

The proof follows from Theorem 2 and continuity of the cost functional. 



The dual norm defining the cost functional is hard to evaluate. For this 
reason we use another functional which also measures the distance between 
and the target Q and which is easy to compute. To this end we introduce 

the new norm in 



Mi/ 2 , inf 

TyV = ip 

It is easy to show that the respective dual norm is given by 



( 11 ) 



where i is the solution of the transmission problem: 



(12) 



(Ah)) 



Find z := z{fj.) £ Hq{Q,) such that 
(Vi, Vw)o,a = (m, ■ i))r, Vt) e 
This leads to a new definition of the shape optimization problem (P) :

(P) 

where z(u{uj)) solves (^(/i)) with jji = — Q. Observe that z{u {to)) — 0 if 

and only if = Q, i.e. the zero value (if any) of the cost functionals in (P) 
and (P) is realized by the same shapes. 

The mathematical analysis for (P) is done in [9]. It is worth noticing that 
the existence of optimal shapes in (P) is ensured under weaker assumptions on 
O. Instead of in (P) one needs boundaries in (P), only. This is due 
to the fact that the cost functional in (P) is expressed by a volume integral. 

2.5 Computational aspects 

In this section we briefly describe the discretization of (P) . For more details we 
refer to [9]. Instead of the family of admissible domains O we shall introduce 
a new family which contains domains with boundaries defined by a finite 
number d := d{K) of parameters, e.g. splines. For lJk, G O^, given we define 
a discretization of as follows: 

' Find X x s.t. 

a{uh,Vh) - [^Ho,Vh]o - = if,Vh)o,Q e 14, 

G Aho X Ah^. 



Here 14 is a discretization of Ho(O), Ahq, Ajj^ are appropriate discretiza- 
tions of iJ~^/^(Fo), respectively, g is an approximation of g and 

[ , ]o, [ , ]'y are approximations of the duality pairings ( , )ro , ( , )p.y ’ respectively. 
The symbols $h$, $H_0$ and $H_\gamma$ denote the norms of the respective partitions used
for the construction of $V_h$, $\Lambda_{H_0}$, $\Lambda_{H_\gamma}$, respectively, and $H := (H_0, H_\gamma)$. In what
follows we shall suppose that the partition $\mathcal{T}_h$ of $\hat\Omega$ characterizing $V_h$ does not
depend on the geometry of 

To ensure the existence and uniqueness of solutions to {F{^k))^ we shall 
need the following stability condition: 

[l^Ho,Vh]o + [l^H^,Vh]y = 0 Vv/J G V/J HHo = = 0. (5) 
For the detailed mathematical analysis of the discrete state problem, in particular for the
verification of the LBB-condition, we refer to [6]. Having $\lambda_{H_\gamma}$ at our disposal we
define the discretization of the transmission problem:



{ Find Zh Zh{XH^) G such that 
{Vzh,Vvh)o,a = [Ai/^ - Q,vh\t G Vh. 

Finally, the discretization of (P) reads as follows: 






(P) 



H 

Kh 



where Zh{\H^) solves 

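To see better the structure of the discrete problem we now present its
algebraic form. Any $\omega_\kappa \in \mathcal{O}_\kappa$ is characterized by a vector $\kappa \in \mathbb{R}^d$ of discrete
design variables. The family $\mathcal{O}_\kappa$ can be identified with a compact subset $U \subset \mathbb{R}^d$.
For a given $\kappa \in U$ we first solve the saddle-point problem:

\[ \begin{cases} \text{Find } (\mathbf{u}, \boldsymbol{\lambda}_0, \boldsymbol{\lambda}_\gamma) \in \mathbb{R}^{n} \times \mathbb{R}^{m_0} \times \mathbb{R}^{m_\gamma} \text{ s.t.} \\ \mathbf{A}\mathbf{u} - \mathbf{B}_0^T \boldsymbol{\lambda}_0 - \mathbf{B}_\gamma^T(\kappa) \boldsymbol{\lambda}_\gamma = \mathbf{f}(\kappa), \\ \mathbf{B}_0 \mathbf{u} = \mathbf{g}, \quad \mathbf{B}_\gamma(\kappa)\mathbf{u} = \mathbf{0}, \end{cases} \qquad (\mathcal{P}(\kappa)) \]

then the linear system:

\[ \mathbf{A}\mathbf{z}(\kappa) = \mathbf{B}_\gamma^T(\kappa)(\boldsymbol{\lambda}_\gamma - \mathbf{Q}), \qquad (\mathcal{A}(\kappa)) \]

where $\mathbf{A}$ is the stiffness matrix and $\mathbf{B}_0$, $\mathbf{B}_\gamma(\kappa)$ are matrices coupling the primal
variable $\mathbf{u}$ with the Lagrange multipliers $\boldsymbol{\lambda}_0$, $\boldsymbol{\lambda}_\gamma$, respectively. Notice that only
the matrix $\mathbf{B}_\gamma$ and the right hand side $\mathbf{f}$ depend on $\kappa$, but not $\mathbf{A}$. Therefore, $\mathbf{A}$
needs to be assembled only once. We finally arrive at the following non-linear
mathematical programming problem:

\[ \min_{\kappa \in U} \tfrac12 (\mathbf{A}\mathbf{z}(\kappa), \mathbf{z}(\kappa))_{\mathbb{R}^n}, \qquad (\mathbb{P}) \]

where $\mathbf{z}(\kappa)$ solves $(\mathcal{A}(\kappa))$.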

A traditional way of solving (R)^ is based on derivative information. It is 
known from the theory that fictitious domain methods which use non-fitted 
meshes may reduce the smoothness of the control-to-state mapping (see [11], 
[12]). In addition, the classical gradient minimization techniques are local 
methods. To obtain a global minimizer one should use global minimization 
methods which do not need any gradient information. For our class of prob- 
lems we use a stochastic type method, namely the modified controlled random 
search (MCRS) algorithm [14]. 
This algorithm uses ideas of the simplex method [16] and the CRS (Controlled
Random Search) algorithm [17]. It starts with a population $P$ of
$N$ points ($N \gg n$) which are chosen at random in the search space $X$
($n := \dim(X)$). A new trial point $x$ is generated from a simplex $S$ (a set
of $n + 1$ linearly independent points of a population $P$ in $X$) by the following
operation called reflection:

\[ x = g - Y(z - g), \]

where $z$ is one (randomly chosen) vertex of the simplex $S$, $g$ is the center
of gravity of the remaining $n$ vertices of the simplex and $Y$ is a randomized
multiplicative factor. Thus the new point $x$ is obtained from the reflection of
the point $z$ with respect to $g$. Let $x_{\max}$ be the point with the largest objective
function value among the $N$ points currently stored in $P$. If $f(x) < f(x_{\max})$
then $x_{\max} := x$, i.e. the worst point in the population is replaced by the new
trial point. The process continues until a stopping criterion is fulfilled.

The main modification of the original CRS algorithm consists in randomizing
the multiplicative factor $Y$. The best results in most tested examples were
obtained with $Y$ distributed uniformly in $[0, \alpha[$ with $\alpha$ ranging from 4 to 8 and
$N = \max(5n, n^2)$. For more details on this algorithm we refer to [14].
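The reflection step and the replacement rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [14]: the function name `mcrs_minimize`, the box-shaped search space, the rejection of trial points that leave the box, the fixed trial budget used as stopping criterion and the omission of the linear-independence check on the simplex are all choices made for this sketch.

```python
import random


def mcrs_minimize(f, bounds, max_trials=2000, alpha=4.0, seed=0):
    """Sketch of controlled random search with a randomized reflection factor.

    f       objective taking a list of n floats
    bounds  list of (low, high) pairs describing an assumed box-shaped search space
    alpha   upper end of the uniform distribution of the factor Y
    """
    rng = random.Random(seed)
    n = len(bounds)
    size = max(5 * n, n * n)  # population size N = max(5n, n^2)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(size)]
    values = [f(x) for x in population]

    for _ in range(max_trials):
        # n + 1 distinct population members form the simplex; the first is z.
        simplex = rng.sample(range(size), n + 1)
        z = population[simplex[0]]
        rest = simplex[1:]
        # Centre of gravity of the remaining n vertices.
        g = [sum(population[i][j] for i in rest) / n for j in range(n)]
        y = rng.uniform(0.0, alpha)  # randomized multiplicative factor Y
        x = [g[j] - y * (z[j] - g[j]) for j in range(n)]  # reflection
        if any(not (lo <= xj <= hi) for xj, (lo, hi) in zip(x, bounds)):
            continue  # assumption of this sketch: discard points outside the box
        fx = f(x)
        worst = max(range(size), key=values.__getitem__)
        if fx < values[worst]:  # replace the worst point of the population
            population[worst] = x
            values[worst] = fx

    best = min(range(size), key=values.__getitem__)
    return population[best], values[best]


def sphere(x):
    """Toy objective used only for the usage example."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = mcrs_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because a trial point only ever replaces the current worst member, the best value stored in the population can never deteriorate; no gradient information is used at any point.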



3 Applications in free-boundary problems 

In this section we show that the shape optimization problem (P) with the 
FD solver of the state problem represents an efficient computational tool for 
solving a large class of free-boundary problems. Since free-boundary problems 
are very well investigated from the theoretical point of view they may serve as 
benchmarks for testing the reliability of the method. In what follows we use 
our approach for solving exterior and interior Bernoulli free-boundary (BFB) 
problems and a dam problem. 

3.1 Bernoulli free boundary problems 

We shall be concerned with the exterior as well as interior BFB problem.
Unlike the exterior BFB problem defined by (1), the unknown component
$\Gamma_f(\omega^*)$ of the boundary is interior to $\Gamma_0$ in the interior BFB problem and the
respective boundary conditions on $\Gamma_0$ and $\Gamma_f(\omega^*)$ read as follows (see [3]):

\[ u^* = 0 \ \text{on } \Gamma_0, \qquad u^* = 1, \quad \frac{\partial u^*}{\partial \nu} = Q \ \text{on } \Gamma_f(\omega^*), \tag{13} \]

where $Q$ is a positive constant. Since the Dirichlet condition on $\Gamma_f(\omega^*)$ is
nonhomogeneous, one has to introduce the new variable $\tilde u := u - 1$ in order to
ensure that $\tilde u = 0$ on $\Gamma_f(\omega^*)$.
The discrete family $\mathcal{O}_\kappa$ consists of all doubly connected domains whose
variable component of the boundary is realized by piecewise second order
Bezier curves. The discretization parameter $\kappa$ is related to the number of
Bezier segments. The space $V_h \subset H_0^1(\hat\Omega)$ is realized by continuous, piecewise
linear functions over a uniform partition $\mathcal{T}_h$ of $\hat\Omega$. Further, $\Lambda_{H_0}$, $\Lambda_{H_\gamma}$ are spaces
of piecewise constant functions over partitions $\mathcal{T}_{H_0}$, $\mathcal{T}_{H_\gamma}$ of polygonal
approximations $\tilde\Gamma_0$, $\tilde\Gamma_\gamma$ of $\Gamma_0$, $\Gamma_\gamma$, respectively (see Fig. 2). Finally, the duality pairings
$[\,,]_0$, $[\,,]_\gamma$ are realized by the $L^2(\tilde\Gamma_0)$, $L^2(\tilde\Gamma_\gamma)$-scalar products, respectively. In
order to satisfy the LBB-condition, the partitions $\mathcal{T}_{H_0}$, $\mathcal{T}_{H_\gamma}$ are constructed in
such a way that $H_0 = 3h$, $H_\gamma = 3h$. For more details on the practical realization
we refer to [9], [11].

Fig. 2. FE partitions

Both external and internal problems were computed
with the following data: $\hat\Omega = (0, 10)^2$, $h = 10/64$, the number of Bezier segments
$\kappa = 10$ and $\Gamma_0$ is L-shaped. The examples are computed for different
values of the target $Q$ by using two cost functionals, namely:

(a) Ji (a;) as in the definition of (P) ; 



(6) J2(w) = 

ll^'ll-iAf^ = 11^ ° 711-1/2, [0,1] and 



- 1 / 2 , [ 0 , 1 ] 



ICnl 


2 


1 + 


n[ 



Jo 



-‘Z'Kint 



dt. 



Here 7 denotes a piecewise linear parametrization representing F^. 



The results obtained by the MCRS method after 2000 function evaluations are 
shown in Figs. 3-4. 
Fig. 3. Exterior BFB (panels: Variant (a), Variant (b))

Fig. 4. Interior BFB (panels: Variant (a), Variant (b))

3.2 A gradient approach 

In this subsection we outline a gradient based approach to the optimal shape
design problem (P) with the $H^{-1/2}(\Gamma_f)$-norm replaced by the $L^2(\Gamma_f)$-norm.
A rigorous foundation of the following arguments as well as a detailed numerical
analysis will be carried out in a future paper. For simplicity we consider
the following variant

\[ J(\omega) = \frac12 \int_{\Gamma_f} \Big| \frac{\partial u}{\partial \nu} - Q \Big|^2 d\sigma, \tag{14} \]

where $\partial u / \partial \nu$ denotes the outer normal derivative of the solution to








70 



J. Haslinger et al. 



\[ \begin{cases} -\Delta u = 0 & \text{in } \omega, \\ u = g & \text{on } \Gamma_0, \\ u = 0 & \text{on } \Gamma_f \end{cases} \tag{15} \]

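with $\Gamma_0$ and $\Gamma_f$ as in Section 2. We assume that the searched component of
the boundary $\Gamma_p := \Gamma_f$ of every admissible domain $\omega_p$, $p \in \Pi$, is the union
of a fixed number $k$ of adjoining arcs $\Gamma_{p,i}$, $\Gamma_p = \bigcup_{i=1}^{k} \Gamma_{p,i}$, such that each arc
$\Gamma_{p,i}$ can be described by a quadratic Bezier curve. More precisely, given an
ordered set of distinct control nodes $x_1, \dots, x_k$, $x_i = (\xi_i, \eta_i)$, $i = 1, \dots, k$,
the $i$-th Bezier arc $\Gamma_{p,i}$ is determined by the triple $(m_{i-1}, x_i, m_i)$ with
$m_i = \frac12(x_i + x_{i+1})$, $i = 1, \dots, k$. Here we set $x_{k+1} = x_1$ and $m_0 = m_k$. Thus $\Gamma_{p,i}$ can
be parametrized by

\[ \gamma_i(t) = m_{i-1}(1-t)^2 + 2x_i t(1-t) + m_i t^2, \quad t \in [0, 1]. \]

Defining $B : [0, 1] \to \mathbb{R}^{1\times 3}$ (a $(1\times 3)$ matrix), $t \mapsto [(1-t)^2,\ 2t(1-t),\ t^2]$,
and the $(3 \times 2)$ matrix $X_i$ whose rows are the coordinates of $m_{i-1}$, $x_i$ and $m_i$,
this parametrization can be compactly written as

\[ \gamma_i(t) = B(t) X_i, \quad t \in [0, 1], \quad i = 1, \dots, k. \]

Shifting the parameter interval, with $B_i(t) := B(t - i + 1)$,

\[ \gamma_i(t) = B_i(t) X_i, \quad t \in [i-1, i], \quad i = 1, \dots, k, \]

one obtains a parametrization $\gamma : [0, k] \to \mathbb{R}^2$ of $\Gamma_p$ defined by $\gamma|_{[i-1,i]} = \gamma_i$.

Thus each index $p \in \mathbb{R}^{2k}$ corresponds to an ordered list of coordinates of
control nodes and $\Pi \subset \mathbb{R}^{2k}$ describes some a priori assumptions on the
location of the control nodes.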
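The piecewise quadratic Bezier parametrization can be made concrete with a short Python sketch. It is an illustration under assumptions: the helper names `midpoints`, `bezier_arc` and `gamma` are hypothetical, the control nodes are stored 0-based, and the curve is assumed to be closed by wrapping the node indices around.

```python
def midpoints(nodes):
    """m_i = (x_i + x_{i+1}) / 2 with wrap-around, so the curve is closed."""
    k = len(nodes)
    return [
        (0.5 * (nodes[i][0] + nodes[(i + 1) % k][0]),
         0.5 * (nodes[i][1] + nodes[(i + 1) % k][1]))
        for i in range(k)
    ]


def bezier_arc(m_prev, x_i, m_i, t):
    """Quadratic Bezier arc m_prev*(1-t)^2 + 2*x_i*t*(1-t) + m_i*t^2, t in [0, 1]."""
    b0, b1, b2 = (1.0 - t) ** 2, 2.0 * t * (1.0 - t), t ** 2
    return (b0 * m_prev[0] + b1 * x_i[0] + b2 * m_i[0],
            b0 * m_prev[1] + b1 * x_i[1] + b2 * m_i[1])


def gamma(nodes, t):
    """Global parametrization over [0, k]: arc i lives on [i, i + 1] (0-based)."""
    k = len(nodes)
    mids = midpoints(nodes)
    i = min(int(t), k - 1)  # arc index; t = k belongs to the last arc
    # mids[i - 1] wraps to mids[k - 1] for i = 0, matching m_0 = m_k.
    return bezier_arc(mids[i - 1], nodes[i], mids[i], t - i)


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
point = gamma(square, 0.5)  # a point on the first arc
```

Consecutive arcs share the midpoint $m_i$ as end point and start point, and both are tangent there to the segment joining $x_i$ and $x_{i+1}$, which is what makes the composite curve continuously differentiable.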

Let $J(p)$ denote the value of the cost functional defined in (14) computed
by solving (15) on $\omega_p$ corresponding to the configuration $p$ of the control nodes.
The Gateaux derivative of $J$ at $p$ in the direction $\delta p$ is defined by

\[ J'(p)\delta p = \lim_{s \to 0+} \frac{1}{s}\big(J(p + s\,\delta p) - J(p)\big). \]
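A one-sided difference quotient gives a direct numerical counterpart of this definition and is a common way to sanity-check an analytically derived gradient. The sketch below is an illustration only: `gateaux_forward` and the toy functional `toy_cost` are hypothetical names, and the toy functional stands in for the true cost, whose evaluation requires a solve of the state problem.

```python
def gateaux_forward(J, p, dp, s=1e-6):
    """One-sided difference quotient (J(p + s*dp) - J(p)) / s for a small s > 0."""
    shifted = [pi + s * di for pi, di in zip(p, dp)]
    return (J(shifted) - J(p)) / s


def toy_cost(p):
    """Stand-in functional J(p) = sum of p_i^2, whose derivative is 2 * <p, dp>."""
    return sum(pi * pi for pi in p)


approx = gateaux_forward(toy_cost, [1.0, 2.0], [1.0, 0.0])  # exact value is 2
```

The step $s$ trades truncation error (proportional to $s$) against rounding error (proportional to $1/s$), so values far below the square root of machine precision should be avoided.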

Proceeding formally one can derive the following representation 

J'{p)5p =- F “ Q^)ada. ( 16 ) 

Above $\alpha$ describes the normal component of the displacement of $\Gamma_p$ caused by
the perturbation $\delta p$ of the configuration $p$ and $\kappa$ stands for the curvature of
$\Gamma_p$. The auxiliary variable $\mu$ is the solution of the adjoint equation
' — = 0 on iOp^

< M = 0 on To, ( 17 ) 

^ on Tp 

and u is the solution of (15) on ujp. Let us introduce the shorthand notation 

J'{p)Sp = J^^JCada (18) 

and let 5Xi G hold the perturbation of the coordinates of the nodes 
Xi^ Xi^i specified by Sp. If we insert 

a{t) = 6i{t)SXii^{t), t e [z - l,z], z = 1, . . . , /c, 

into (18) we obtain 



k 

J'{p)Sp = ^f K{t)Bi{t)5Xiu{t)\i{t)\dt. 

i=l 

If we just perturb the jf-th coordinate of the l-th control node by Sr, j = 
1,2, this will affect only the Bezier arcs Fp^i and Fp^i^i determined 

by (m/_ 2 ,x/_i,m^_i), {mi^i,xi,mi) and (m/, x/+i, m/+i), respectively. This 
results in 



J'{p)Sp = i [//_2 + //_J IC{t){l + 2t{l - t))pj{t)\y{t)\ dt 

+ IC{t){l - t)^i'j{t)\'f'{t)\dt] St :=gijST 

(19) 

with obvious modifications for / = 1 and I = k. Hence the gradient J'{p) is 
represented by the matrix G G with entries Gij = gij determined by 

(19). 

We demonstrate the feasibility of this approach by applying this concept to 
the Bernoulli problem described in Section 3. Figure 5.1 shows the computed 
free boundaries (solid line) after 13, 12 and 19 optimization steps based on 
the gradient information supplied by (19) for $Q = -1.0$, $Q = -0.5$ and $Q = -0.35$,
respectively. The dotted line indicates the initial configuration, stars
the corresponding initial and circles the final positions of the control nodes. For 
comparison purposes we also show the free boundaries (dashed line) obtained 
in Section 3 by a global method. Figure 5.2 illustrates the decrease of J on 
a logarithmic scale. 

3.3 A dam problem 

Fig. 5. Exterior BFB (gradient approach)

We conclude this paper by solving a dam problem [4], [5]. The classical boundary
variation approach has been used in [1] and [7]. The vertical wall $\hat\Omega$ made
of a non-homogeneous material separates two water levels of height $y_1$, $y_2$. One
wants to find a curve separating the wet and dry part of $\hat\Omega$. The mathematical
model leads to the following free-boundary problem:



' Find C Cl and u : Cl^ W such that 
— diY {kVu) = 0 in Cl^, 

< u ^ yi on Ti, i = 1,2, 

= 0 on To ur,^, 

, u = y on r,,5urtr. 



( 20 ) 



where k € L°°{d), k > ko > 0 and the partition of dCly, into To, Ti, F 2 , 
and Ta is shown in Fig. 6. The Neumann condition on F^^ will be satisfied by 




Fig. 6. Geometry of the dam problem 
minimizing the cost functional J{(p) = 511 ^^ 11 - 1/2 whereas the rest of
system (20) defines a well-posed state problem (V{(p)) for any (f G If O 
is made of a homogeneous material, corresponding to a constant /c > 0, 
consists of smooth, concave and decreasing functions defined over Fq. After 
substituting u := u — y ioi given ^ G we obtain the new state problem 
with homogeneous Dirichlet data on Fo- U Ft^: 

' — div (kVu) — ^ 

u = yi-y on Vi, i = 1,2, 

< 

u = 0 on F(^ U Fa, 

f = 1 on To. 

The FD formulation of (V{(p)) is now straightforward: 

' Find (il, A(^) GVg x such that 

< f kVu-Vvdxdy = - [ k— dx dy F {Xip,v)r^ Vu G Vo, (V{(f)) 
JQ Jn dy 

^(/x,«)r, =0 Vm e F-i/2(r^), 






where 

v^ = {ve H\n)\ v = gondQ\ Fq}, 

( yi-y onTi, i = l,2, 

Vb := Va=o, 9 — { 

[OonaQ\(FiUF2UFo) 

and k is the zero extension of k from to Q. The cost functional to be 
minimized now takes the form 

— 5ll"^<^ll-i/2,r^’ 

where i/y is the ^/-component of ly. In computations the cost functional (21) 
is replaced by the I/^(F(^)-norm. As before the spaces and are 

approximated by continuous, piecewise linear and piecewise constant functions, 
respectively and the unknown component F(^ by piecewise, second order Bezier 
curves. The example was solved with the following data: ft = (0, 1.62) x (0, 4), 
h = 1/32, yi = 3.22, y 2 = 0.84, k = 1 and the number of Bezier segments 
K = 6. The free boundary found by the MCRS method after 2000 function 
evaluations is shown in Fig. 7. 



Conclusions 

The variant of the FD method presented in this paper provides an efficient 
computational tool for the numerical realization of shape optimization prob- 
lems. Its main features are the following: 
Fig. 7. Found free boundary



- it enables us to solve efficiently state problems; 

- the use of non-fitted meshes in discretized problems avoids remeshing of the
fictitious domain after any change of the shape of an optimized structure.
Consequently, the whole process is more “user friendly” compared with
the standard boundary variation technique;

- the (co)-normal derivative of the state which appears in the shape derivative
of cost functionals is a part of the solution to the FD-formulation;

- it can be utilized with minor changes for the numerical realization of a large 
class of free-boundary problems. 
Summary. In this paper we show the existence of weak solutions for a nonlinear
elliptic equation with arbitrary growth of the nonlinearity and measure data. A
numerical algorithm to compute a numerical approximation of the weak solution is
described and analysed. In a first step a super-solution is computed using a domain
decomposition method. A numerical example is presented and discussed.



1 Introduction 

This work deals with weak solutions of the following quasi-linear elliptic problem
with Dirichlet boundary conditions:

\[ \begin{cases} -u''(t) + G(t, u'(t)) = F(t, u(t)) + f(t) & \text{in } (0,1), \\ u(0) = u(1) = 0 \end{cases} \tag{1} \]

where $G, F : [0, 1] \times \mathbb{R} \to [0, +\infty[$ are measurable and continuous with respect
to $u'$ and $u$, respectively, and $f$ is a given finite nonnegative measure on $(0, 1)$.

The main goal is to present a numerical analysis of these weak solutions
and to study their existence and uniqueness. Such problems arise from biological,
chemical and physical systems and various methods have been proposed
to study existence, uniqueness, qualitative properties and numerical simula- 
tion of such solutions (see [8]). When / is regular, it is proved in [9] that 
if (1) has a nonnegative super-solution in then (1) has a solution in 

Many authors dealt with this problem when / is irregular and 
G is sub-quadratic with respect to u' namely: 

\G{t,r)\<c{g{t) + \r\'^), g{t) £ (0,1), c> 0 (2) 

They showed that, if G satisfies (2), (1) has a solution u G Hq[0^ 1) provided 
that (1) has a super-solution in 1) see [6], [5]. The case where the 

super-solution itself is irregular has been treated in [2], if the super-solution 
belongs to $H_0^1(0, 1)$ then (1) has a solution in $H_0^1(0, 1)$ provided that $G$ satisfies
(2).

In this work we are particularly interested in situations where $f$ is irregular
and where the growths of $G$ with respect to $u'$ and $F$ with respect to $u$ are
arbitrary. Let us be more precise about the model problem:

\[ \begin{cases} -u''(t) + |u'(t)|^p = |u(t)|^q + f(t) & \text{in } (0,1), \\ u(0) = u(1) = 0 \end{cases} \tag{3} \]

where $p, q > 1$ and $f \in M_b^+(0, 1)$, the set of nonnegative finite measures on
$(0, 1)$. We show here that if the semi-linear problem:

\[ \begin{cases} -w''(t) = |w(t)|^q + f(t) & \text{in } (0,1), \\ w(0) = w(1) = 0 \end{cases} \tag{4} \]

has a solution then (3) has a solution. Remark here that no restrictions for 
p and q are imposed. For an elegant study of (4), see the work of Pierre and 
Baras [4]. If ic'(O) = 4-oo or w'{l) = — oo then w ^ and obviously 

the classical approach fails to provide existence of solutions of (3) and new 
techniques have to be used. 

Another approach studied here is the numerical approximation of the so- 
lution of problem (1). In this approach the most important difficulties are to 
determine the uniqueness and the blowup of the solution. 

The general algorithm for the numerical approximation of this equation is 
the application of the Newton method to the discrete version of problem (1): 

\[ \text{Find } U \text{ such that } AU = H(U) \tag{5} \]

where $A$ is a sparse matrix and $H : \mathbb{R}^n \to \mathbb{R}^n$ is a nonlinear operator.

The Newton algorithm is given by: choose $U^0$ in a neighbourhood of the solution
and solve until convergence

\[ (A - H'(U^k)\,Id)(U^{k+1} - U^k) = -AU^k + H(U^k) \tag{6} \]

where $H'(U^k)$ is the Jacobian matrix of the operator $H$ computed at $U^k$ and
$Id$ is the identity matrix in $\mathbb{R}^n$. This method converges quadratically when it
converges. Convergence depends in particular on the choice of $U^0$ and on the
existence and uniqueness of solutions of the linear system (6). In the case of
problem (1) the matrix $A - H'(U^k)Id$ is often singular.
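The Newton iteration for $AU = H(U)$ can be illustrated on a small finite-difference example. The sketch below is not the authors' code and makes several assumptions: $A$ is the standard three-point discretization of $-u''$ on $(0,1)$ with homogeneous Dirichlet conditions, the nonlinearity $H(U)_i = 1 - U_i^3$ is a hypothetical choice made so that the Newton matrix $A - H'(U^k)$ stays symmetric positive definite (precisely the favourable situation that the text says often fails for problem (1)), and the names `solve_tridiagonal` and `newton_fd` are invented for this example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm without pivoting; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def newton_fd(n=99, tol=1e-9, max_iter=50):
    """Newton iteration for A U = H(U) with the assumed H(U)_i = 1 - U_i**3."""
    h = 1.0 / (n + 1)
    inv_h2 = 1.0 / (h * h)
    U = [0.0] * n  # initial guess U^0

    def residual(V):
        # -A V + H(V), with A = tridiag(-1, 2, -1) / h^2 and zero boundary values
        res = []
        for i in range(n):
            left = V[i - 1] if i > 0 else 0.0
            right = V[i + 1] if i < n - 1 else 0.0
            res.append(-(2.0 * V[i] - left - right) * inv_h2 + 1.0 - V[i] ** 3)
        return res

    for _ in range(max_iter):
        r = residual(U)
        if max(abs(ri) for ri in r) < tol:
            break
        # Newton matrix A - H'(U): H'(U) = diag(-3 U_i^2), so the diagonal grows.
        diag = [2.0 * inv_h2 + 3.0 * ui * ui for ui in U]
        off = [-inv_h2] * n
        delta = solve_tridiagonal(off, diag, off, r)
        U = [ui + di for ui, di in zip(U, delta)]
    return U, max(abs(ri) for ri in residual(U))


U, res = newton_fd()
```

With this sign of the nonlinearity each linear system is uniquely solvable; for nonlinearities of the opposite sign the same matrix can become singular, which is the difficulty addressed by the domain decomposition.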

To overcome this difficulty we introduce a domain decomposition to compute
an approximation of $\delta U^k = U^{k+1} - U^k$ by the resolution of a sequence of
problems of type (1) in subsets $\Omega_i$ of $(0, 1)$, such that $\Omega = \bigcup_{i=1,K} \Omega_i$. The idea
of the method comes from the following remark [11]:

Lemma 1. Let $0 < a < b < 1$, $a_i \in L^\infty(0, 1)$, for $i = 1, 2$. If $|b - a|$ is small
enough then the operator $-\frac{d^2}{dt^2} - a_1(t)\frac{d}{dt} - a_2(t)\,Id$ has an inverse in $(a, b)$.
We have organized this paper in the following manner. In section 2 we give
the exact setting of the problem, we present an approximate equation for (1) 
and we prove that the existence of weak super-solutions implies the existence 
of weak solutions, without any restriction of the growth of G with respect to 
u', this result being a generalization of the classical result of [9], [6] and [2]. 

In section 3 we present an approximation scheme for problem (1) based on 
the Schwarz overlapping domain decomposition method, combined with finite 
element method. 

This work was supported by the French Grant "Action Intégrée MA/02/33".



2 Mathematical analysis of the problem 

Throughout this paper we suppose that $f$ is a nonnegative finite measure
on $(0, 1)$ and $G, F : [0, 1] \times \mathbb{R} \to [0, +\infty)$ are measurable.

The functions $r \mapsto G(t, r)$, $F(t, r)$ are continuous for a.e. $t$, (7)

$F(t, \cdot)$ is nondecreasing and $G(t, \cdot)$ is convex, (8)

$\forall r \in \mathbb{R}$, $G(\cdot, r)$, $F(\cdot, r)$ are integrable on $(0, 1)$, (9)

$G(t, 0) = \min\{G(t, r),\ r \in \mathbb{R}\} = 0$ and $F(t, 0) = 0$. (10)



We introduce the notion of weak solution, super-solution and sub-solution used 
here. 

Definition 1. A function $u$ is said to be a weak solution of (1) if

\[ \begin{cases} u \in W^{1,\infty}_{loc}(0,1) \cap C_0[0,1], \\ -u''(t) + G(t, u'(t)) = F(t, u(t)) + f \quad \text{in } \mathcal{D}'(0,1) \end{cases} \tag{11} \]

(replace in (11) $=$ by $\ge$ for a weak super-solution and by $\le$ for a weak
sub-solution).

Remark 1. In (11) $u \in W^{1,\infty}_{loc}(0, 1)$; using (9) we have $G(t, u'(t))$ and
$F(t, u(t)) \in L^1_{loc}(0, 1)$. Hence every term in (11) makes sense.

This enables us to state the main result of this paper. 

Theorem 1. Assume that (7)-(10) and $f \in M_b^+(0, 1)$ hold. Assume that there
exists a weak solution $w$ to the problem

\[ \begin{cases} w \in W^{1,\infty}_{loc}(0,1) \cap C_0[0,1], \\ -w'' = F(\cdot, w) + f \quad \text{in } \mathcal{D}'(0,1). \end{cases} \tag{12} \]

Then $w$ is a super-solution of (1) and there exists a weak solution $u$ of (1) such
that $u \le w$.
Remark 2. 1) It should be noted that there are no growth restrictions on the
lower order nonlinearities $F$ and $G$ w.r.t. $u$ and $u'$ respectively. Hence the
present theorem extends some results proposed in [2], [6].

2) For any finite nonnegative measure $f$, the problem:

\[ -\underline{w}'' + G(t, \underline{w}') = f \quad \text{in } \mathcal{D}'(0,1) \tag{13} \]

has a unique solution $\underline{w}$, see [1], and remark here that $\underline{w}$ is a sub-solution of
the problem (11).

In order to prove Theorem 1, we consider $G_n(t, \cdot)$, for $n > 0$, the Yosida
approximation of $G(t, \cdot)$ which increases a.e. to $G(t, \cdot)$ as $n$ tends to infinity.
Note that $G_n(t, \cdot)$ satisfies (7)-(10) and

\[ G_n \le G, \quad G_n \le G_{n+1}. \tag{14} \]

According to the result given in [1], [6] there exists a sequence $(u_n)$ of solutions
to the problem:

\[ \begin{cases} u_{n+1} \in W^{1,\infty}_{loc}(0,1) \cap C_0[0,1], \\ -u''_{n+1} + G_{n+1}(t, u'_{n+1}) = F(\cdot, u_n) + f \quad \text{in } \mathcal{D}'(0,1) \end{cases} \tag{15} \]

where $u_0 = w$. To have estimates when passing to the limit, we give the
following three lemmas, see [1].

Lemma 2. Let a(t) G L/^^(0, 1), v G W^^4(0, l)n Co[0, 1] such that 



a(t)v'{t) € lLc(0,l) 
—v" — a v' >0 in D' (0, 1) 



(16) 



Then v > 0 in [0, 1]. 

Lemma 3. Let u G (0, 1), v, F G I/°®(0, 1) and ji G M^(0, 1) such that: 



V < u < V in ]0, 1[ 

— u" < la in P' (0, 1) (17) 

—v" '>11 in V' (0, 1) 



Then u G W^^’^(0, 1), and 

|m'( 0I < a b) +M\mb) (18) 

for all 0 < a < b < 1, where d{t;a,b) = min(b — t^t — a) and c{a^b) is 
a constant depending on a and b. 

Lemma (3), will provide (0, 1) estimates for the approximate solution 
Un- But this estimate does not allow us to pass to the limit in the nonlinear 
terms. We need the strong convergence of Un in We obtain this 

result from the following Lemma. 
Lemma 4. Let {un)n C Wq^’°°(0, 1) such that,



u strongly in L°°(0,1) 


(19) 


W_ < U < Uni 




— u"<fi in V' (0, 1) 


(20) 


w" < {i in V' (0, 1) 





Then u!^ — ^ u' strongly in r^c(o>i) 

Proof of the theorem (1). In a first step we have by direct application 
of the maximum principle (see [2]) that 

m ^ '^n+i for all n > 0 (21) 

In a second step, since $F$ is monotone w.r.t. $r$ we can prove by induction
that

\[ u_n \le w \quad \text{in } [0, 1] \text{ for all } n \ge 0. \tag{22} \]

By lemma (3) is bounded in W^^’^(0,1) D Co [0,1] independently of n. 
Therefore, there exists a subsequence, still denoted by (un) for simplicity, 
such that converges to u strongly in L°°(0, 1) if n ^ oo. Also converges 

to u' strongly in L[^^(0, 1) and a.e. in (0,1). Then from lemma (3) we have 
'^n+i converging to u' strongly in r~c(0,l), and 

l|w^l|L~(a,6) < K{a,b){c{a,b) + ||wl|ioo(o,i) + ||/||mb + lli£||L~(o,i)) (23) 

where K{a,b) = 1/r] and 0<77<a<?7-t-^<l. 

Since C(t, .) and F{t, .) are continuous with respect to the two last argu- 
ments, we have for all 0 < a < 6 < 1 



G{t,u'^^i ) , F{t,Un) converges to G{t,u ') , F{t,u) a.e. t G (0,1). 
On the other hand, for a.e t G (a, h) 

|C(i,<+i(0l < max \G{t,r)\ = 6>(i) 

|r| <C'(a,6) 



(24) 

(25) 



and 



|F(^,n„(0)| < n,-n II M 

|s|<max( ||t<^||Loo(o,i)) IIhIIl°«(o,i)) 



m ( 26 ) 



and 6,6 ^ (0, 1) from (9). Using Lebesgue’s dominate convergence Theo- 

rem (see [7]), we also have; 

G{t,u'^_^i), F{t,Un) converges to G{t,u'), F{t,u) in (a, 6) respectively 

(27) 

Now, we can pass to the limit in (15), and ii (p G P (0, 1) with sup p> C [a, b] 
then 



0 lim (-<+1 + C«_^.i) - F{un), (f) = {-u" -f G{u') - F{u),ip) 

(28) 

where (., .) denotes the duality pairing between P'(0, 1) and 7^(0, 1). The the- 
orem follows. 
3 Numerical method



In this section we present the numerical method to solve equation (1). Formally
the iterative method brings out a sequence of numerical solutions of (15) in
$H_0^1(0, 1)$ with a first guess which is a super-solution of (1), in our case, a
solution of problem (12).

Then the algorithm can be formulated in the following way:

1) Find $w \in H_0^1(0, 1)$ such that:

\[ (P_1) \qquad -w'' \ge F(\cdot, w) + f \tag{29} \]

2) Given $u_0 = w$ we compute a sequence $(u_n)$, solution in $H_0^1(0, 1)$ of the
nonlinear equation:

\[ (P_2) \qquad -u''_{n+1} + G_{n+1}(t, u'_{n+1}) = F(u_n) + f \tag{30} \]

Both problems $(P_1)$ and $(P_2)$ are nonlinear, and if $(P_1)$ has a solution,
in Theorem 1 we prove that the solution of $(P_2)$ is also a solution of the
equation (1). Problem $(P_2)$ has a unique solution and the numerical calculation
is straightforward by the Newton method. To solve the nonlinear equation $(P_1)$,
which presents some interesting difficulties, we construct a sequence $(w^k)$ such
that $w^k$ is a solution of a linear problem and converges to $w$.

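Let $w^0 = 0$; we define $w^{k+1} = w^k + \delta$ where $\delta$ is the solution of the
following linear problem:

\[ (P_3) \qquad \begin{cases} -\delta'' - \dfrac{\partial F}{\partial w}(w^k)\,\delta = (w^k)'' + F(w^k) + f & \text{in } (0,1), \\ \delta(0) = \delta(1) = 0 \end{cases} \tag{31} \]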



Then at each iteration we have to solve the linear problem (P3). To achieve 
this, we consider a weak formulation of the problem and use the finite element 
method. 

To simplify the text we reformulate $(P_3)$ in the following way: find $v \in
H_0^1(a, b)$ such that:

\[ (P_4) \qquad \begin{cases} -v''(t) + c(t)v(t) = h(t) & \text{in } (a, b), \\ v(a) = v(b) = 0 \end{cases} \tag{32} \]

where $h \in M_b(a, b)$, the set of finite measures in $(a, b)$, and $c(t) \in L^\infty(a, b)$,
without restriction on its sign. We assume $c_\infty = \|c\|_{L^\infty(a,b)}$ is bounded.

According to Lemma 1, problem $(P_4)$ has a solution in a domain $(a, b)$
small enough.

If $V = H_0^1(a, b)$ then the weak formulation of $(P_4)$ is:

\[ \text{find } v \in V : \quad a(v, w) = \int_a^b \big(v' w' + c(x)\, v\, w\big)\, dx = \int_a^b h\, w\, dx = (h, w) \quad \forall w \in V. \tag{33} \]
( 33 ) 
Thanks to the Poincaré inequality we have:



a(w,w) — {w\w') -f- (c{t)w^w) > ( 



\b-a\ 



Coo){w,w) 



( 34 ) 



Thus the bilinear form a{w^v) is coercive if |5 — a| < 

This remark is of great interest, because it can be exploited to obtain 
a numerical solution of (P 4 ) using a domain decomposition technique. In other 
words, this means that the domain partition should be determined by the 
behavior of 1 1 



The aim of this paragraph is to introduce the Schwarz overlapping domain
decomposition method [10] applied to problem $(P_4)$. To simplify, without loss
of generality, we assume that we can consider a two-domain decomposition
$(a, b) = (a, \beta) \cup (\alpha, b)$ such that:



with a < j3 and [j3 — a) — a) 



< mini 





(35) 



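Then, if $v^0$ is an initialization function, defined on $(a, b)$ and vanishing at $a$
and $b$, we define for $k \ge 0$ two sequences $v_i^{k+1}$, $i = 1, 2$, solving the following
problems:

\[ \begin{cases} -(v_1^{k+1})''(t) + c(t)\, v_1^{k+1}(t) = h(t) & \text{in } (a, \beta), \\ v_1^{k+1}(a) = 0; \quad v_1^{k+1}(\beta) = v_2^{k}(\beta) \end{cases} \tag{36} \]

and

\[ \begin{cases} -(v_2^{k+1})''(t) + c(t)\, v_2^{k+1}(t) = h(t) & \text{in } (\alpha, b), \\ v_2^{k+1}(\alpha) = v_1^{k}(\alpha); \quad v_2^{k+1}(b) = 0 \end{cases} \tag{37} \]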
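The overlapping Schwarz iteration can be tried out on a finite-difference discretization of $-v'' + c\,v = h$. The following Python sketch is an illustration under assumptions rather than the authors' finite element code: it uses a uniform grid on $(0,1)$, a constant negative coefficient $c = -2$ (small enough in modulus for the problem to remain coercive), the overlap $(\alpha, \beta) = (0.4, 0.6)$, and the parallel variant in which both subdomain solves take their interface data from the previous iterate. All function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm without pivoting; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = upper[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * cp[i - 1]
        cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def solve_dirichlet(c, rhs, h, lo, hi, left, right):
    """Solve -v'' + c v = rhs on grid nodes lo..hi with v[lo]=left, v[hi]=right.

    Returns the values at nodes lo..hi (inclusive).
    """
    m = hi - lo - 1  # number of interior unknowns
    inv_h2 = 1.0 / (h * h)
    diag = [2.0 * inv_h2 + c[lo + 1 + j] for j in range(m)]
    off = [-inv_h2] * m
    b = [rhs[lo + 1 + j] for j in range(m)]
    b[0] += left * inv_h2
    b[-1] += right * inv_h2
    return [left] + solve_tridiagonal(off, diag, off, b) + [right]


def schwarz(c, rhs, h, n, ia, ib, iterations):
    """Parallel overlapping Schwarz on (a, beta) = nodes 0..ib and (alpha, b) = nodes ia..n."""
    v1 = [0.0] * (ib + 1)       # iterate on (a, beta), initialised with zero
    v2 = [0.0] * (n - ia + 1)   # iterate on (alpha, b); v2[j] sits at node ia + j
    for _ in range(iterations):
        new_v1 = solve_dirichlet(c, rhs, h, 0, ib, 0.0, v2[ib - ia])  # data at beta
        new_v2 = solve_dirichlet(c, rhs, h, ia, n, v1[ia], 0.0)       # data at alpha
        v1, v2 = new_v1, new_v2
    return v1, v2


n = 100
h = 1.0 / n
c = [-2.0] * (n + 1)
rhs = [1.0] * (n + 1)
reference = solve_dirichlet(c, rhs, h, 0, n, 0.0, 0.0)  # one global solve
v1, v2 = schwarz(c, rhs, h, n, ia=40, ib=60, iterations=200)
```

On this example both subdomain iterates approach the restriction of the global discrete solution geometrically; the contraction becomes weaker as the overlap $\beta - \alpha$ shrinks.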



Now to prove the convergence of the Schwarz overlapping domain decom- 
position algorithm applied to problem (P 4 ) we consider two problems: 



/ + c{t)vi{t) = h{t) G {a, (3) 

\ vi(a) = 0 ; vi{(3) = V 2 (/ 3 ) 



(38) 



and: 



(P iJ “'^2(0" + c{t)v2{t) = h{t) in (q;, 1 ) 
\ V2(q) = Vl{p), V2{b) = 0 



(39) 



Let u be u = ui in (a,/3), v — V2 in {a^h), then vi = V2 in {a, ( 3 ). We 
suppose the existence of a solution of P 4.1 in C{a,P) and a solution of P 4.2 in 
C(a,6). 



Theorem 2 . Assume a^b,a and (3 are such that a < (3, (/3 — a), {b — a) < 
min(^, 2 ^^=). Then the sequence converges to v in C{a,f3) and C{a,b). 

Proof: 

Let $d^k = v_1^k - v$ in $(a, \beta)$ and $e^k = v_2^k - v$ in $(\alpha, b)$. We prove the following
inequality:

\[ \|d^{k+2}\|_\infty \le \gamma\, \|d^{k}\|_\infty \quad \text{and} \quad \|e^{k+2}\|_\infty \le \gamma\, \|e^{k}\|_\infty \tag{40} \]

where $\gamma < 1$.

The difference satisfies the following equation. 

(P ^/ + c(i)d'=+i(t) = 0 in (a,/3) 

j ^fc+i(a) = 0 ; = V^{f3) - v{/3) 

and e* satisfies a similar equation in (a, b) with boundary conditions 
= d^{a) ; 
e'=+i(6) =0. 

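If we consider the following equation:

$$-\varphi''(t) - c_\infty\,\varphi(t) = 0 \ \text{in } (a, \beta), \qquad \varphi(a) = 0, \quad \varphi(\beta) = |e^{k+1}(\beta)|, \qquad (41)$$

then

$$\varphi(t) = |e^{k+1}(\beta)|\,\frac{\sin(\sqrt{c_\infty}\,(t - a))}{\sin(\sqrt{c_\infty}\,(\beta - a))}. \qquad (42)$$

This solution is unique and positive if $(\beta - a) < \ldots$ If we consider the difference $z = \varphi - d^{k+2}$, it is easy to prove that $z \ge 0$. If now $z = \varphi + d^{k+2}$, we have also $z \ge 0$ and $-|e^{k+1}(\beta)| \le d^{k+2}(t) \le |e^{k+1}(\beta)|$, $\forall t$ in $(a, \beta)$. Then the inequality

$$\|d^{k+2}\|_\infty \le |e^{k+1}(\beta)| \le \|e^{k+1}\|_\infty$$

holds.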

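To prove that $|e^{k+1}(\beta)| \le \gamma\,\|d^{k}\|_\infty$ with $\gamma < 1$, we consider the equation:

$$-\phi''(t) - c_\infty\,\phi(t) = 0 \ \text{in } (\alpha, b), \qquad \phi(\alpha) = |d^{k}(\alpha)| \ \text{and} \ \phi(b) = 0.$$

The solution of this equation is given by:

$$\phi(t) = |d^{k}(\alpha)|\,\frac{\sin(\sqrt{c_\infty}\,(b - t))}{\sin(\sqrt{c_\infty}\,(b - \alpha))}.$$

This solution is positive if $(b - \alpha) < \ldots$, and we have $\phi(t) \ge |e^{k+1}(t)|$, $\forall t \in (\alpha, b)$.

Then $|e^{k+1}(\beta)| \le \phi(\beta) \le \gamma\,|d^{k}(\alpha)|$ with

$$\gamma = \frac{\sin(\sqrt{c_\infty}\,(b - \beta))}{\sin(\sqrt{c_\infty}\,(b - \alpha))}.$$

The coefficient $\gamma$ is smaller than 1 only if $\alpha < \beta$.

In conclusion we have $\|d^{k+2}\|_\infty \le \gamma\,\|d^{k}\|_\infty$ if $\alpha < \beta$ and $(\beta - a),\ (b - \alpha) < \min(\ldots)$. In the same way we prove that $\|e^{k+2}\|_\infty \le \gamma\,\|e^{k}\|_\infty$ if $\alpha < \beta$ and $(\beta - a),\ (b - \alpha) < \min(\ldots)$.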

We conclude that the Schwarz overlapping domain decomposition method 
applied to problem (P4) converges. 
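
As an illustration only, the overlapping iteration can be sketched for a simplified setting. The sketch assumes constant coefficients c > 0 and h, the interval (0, 1), a uniform finite-difference grid, and boundary data taken from the previous iterate of the other subdomain; the function names, the grid indices `ia`, `ib` (positions of α and β) and the iteration count are hypothetical choices, not taken from the paper.

```python
def solve_dirichlet(c, h, t0, t1, left, right, n):
    """Solve -v'' + c*v = h on (t0, t1) with v(t0)=left, v(t1)=right.

    Second-order centered differences on n interior points,
    tridiagonal system solved by the Thomas algorithm.
    Returns the n + 2 grid values including both boundary values.
    """
    dt = (t1 - t0) / (n + 1)
    off = -1.0 / dt**2                 # constant off-diagonal entry
    diag = [2.0 / dt**2 + c] * n
    rhs = [h] * n
    rhs[0] -= off * left               # move known boundary values to the right-hand side
    rhs[-1] -= off * right
    for i in range(1, n):              # forward elimination
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    v = [0.0] * n
    v[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):     # back substitution
        v[i] = (rhs[i] - off * v[i + 1]) / diag[i]
    return [left] + v + [right]


def schwarz(c, h, n_total, ia, ib, iterations):
    """Overlapping Schwarz iteration on (0, 1) with subdomains (0, beta) and (alpha, 1).

    The grid has n_total intervals; alpha sits at index ia, beta at index ib, ia < ib.
    Each sweep uses the other subdomain's previous iterate as boundary data.
    """
    dt = 1.0 / n_total
    v1 = [0.0] * (ib + 1)              # values on grid indices 0..ib
    v2 = [0.0] * (n_total - ia + 1)    # values on grid indices ia..n_total
    for _ in range(iterations):
        bc_beta = v2[ib - ia]          # previous v2 at beta
        bc_alpha = v1[ia]              # previous v1 at alpha
        v1 = solve_dirichlet(c, h, 0.0, ib * dt, 0.0, bc_beta, ib - 1)
        v2 = solve_dirichlet(c, h, ia * dt, 1.0, bc_alpha, 0.0, n_total - ia - 1)
    return v1, v2
```

Comparing `v1` and `v2` with a single solve on the whole interval gives a direct check that the iterates approach the global discrete solution on their respective subdomains.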



4 Numerical Results 

The algorithm introduced in the previous section has been implemented numerically for the model problem (3) with p = q = 3 and f(t) a Dirac mass at …

To study the convergence history of the numerical simulation plotted in figure 1 we consider two steps. In the first step, where we compute a super-solution, we observe the evolution of the number of sub-domains: it goes from m = 2 sub-domains to m = 10 sub-domains in five iterations according to criterion (35). The simulation stops after 17 iterations, when the residual is of the order of $10^{-\ldots}$.

In the second step, starting with the super-solution computed in the previous step, we perform nine iterations of the Yoshida approximation described in section 2, and the simulation stops when the correction computed is, in uniform norm, of the order of $10^{-\ldots}$.




Fig. 1. Example with f(t) a Dirac mass, m = 10 (curves: computed solution, computed super-solution).



References 

1. Alaa, N. (1989): Étude d'équations elliptiques non-linéaires à dépendance convexe en le gradient et à données mesures, Thèse de Doctorat, Université de Nancy I

2. Alaa, N., Pierre, M. (1993): Weak solution of some quasi-linear elliptic equations with data measures, SIAM J. Math. Anal., 24, 23-35

3. Alaa, N., Iguernane, M. (2002): Weak periodic solutions of some quasi-linear parabolic equations with data measure, J. Inequal. Pure Appl. Math. 3

4. Baras, P., Pierre, M. (1984): Critères d'existence de solutions positives pour des équations semi-linéaires, Annales Fourier Grenoble, 24, 1985-2006

5. Bensoussan, A., Boccardo, L., Murat, F. (1988): On a nonlinear partial differential equation having natural growth terms and unbounded solution, Ann. Inst. Henri Poincaré, 5, 347-364

6. Boccardo, L., Murat, F., Puel, J.P. (1989): Existence results for some quasi-linear parabolic equations, Nonlinear Analysis Theory Method and Applications, 13, 373-392

7. Brezis, H. (1983): Analyse fonctionnelle, théorie et applications, Masson







N.E. Alaa, J.R. Roche 



8. Levin, S.A., Hallam, Th.G., Gross, L.J. (1989): Applied Mathematical Ecology, Biomathematics 18, Springer Verlag

9. Lions, P.L. (1980): Résolution de problèmes elliptiques quasilinéaires, Arch. Rational Mech. Anal., 74, 335-353

10. Quarteroni, A., Valli, A. (1999): Domain Decomposition Methods for Partial Differential Equations, Oxford Science Publications

11. Witomski, P. (1983): Sur la résolution numérique de quelques problèmes non-linéaires, Thèse d'État, Université Scientifique et Médicale de Grenoble




Variants of Relaxation Schemes and the Lattice
Boltzmann Model Relaxation Systems



Mapundi Kondwani Banda

Darmstadt University of Technology, Schlossgartenstr. 7, D-64289 Darmstadt
banda@mathematik.tu-darmstadt.de



Summary. In the low Mach number limit of the Lattice Boltzmann type models 
one obtains the incompressible Navier-Stokes equation. This is achieved by asymp- 
totic analysis. Moreover in the course of this analysis, the Lattice Boltzmann Model 
reduces to a relaxation system which can be discretized using relaxation schemes. 
We present two variants of the relaxation schemes characterized by local approxima- 
tion of characteristic speeds and a multidimensional flux approximation. These are 
applied to relaxation systems. Their performance will be discussed with reference to 
test cases of isothermal incompressible flow. 



1 Introduction 

Many kinetic equations or discrete velocity models of kinetic equations yield 
in the limit for small Knudsen and Mach numbers an approximation of the 
incompressible Navier Stokes (INS) equations. A classical example is given 
by the discrete velocity models used for Lattice-Boltzmann methods, see [1]. 
These discrete velocity models can be viewed as relaxation systems for the INS 
equations. 

Relaxation type schemes have been used successfully to discretize such 
relaxation systems. In particular, a large number of numerical methods for 
kinetic equations with stiff relaxation terms have been considered in fluid dy- 
namic or diffusive limits. For these relaxation methods and asymptotic pre- 
serving methods, we refer to [2, 3] and for more general applications of relax- 
ation schemes to the recent review paper [4] . Such a multiscale based approach 
provides an alternative to understanding the numerical transition from kinetic 
models to the continuum models. This can be used as a platform for developing 
alternative numerical schemes for INS. In the context of hyperbolic conserva- 
tion laws relaxation schemes provide efficient high resolution and Riemann 
solver free numerical methods. Here we introduce an HLL type of relaxation
scheme and apply it in the context of INS. Further, by choosing a different
method for computing cell averages, a scheme that can be considered
multidimensional is realized. This paper follows closely the work presented in [6].

2 Lattice-Boltzmann type Discrete Velocity Models and
Simplified Relaxation Systems

2.1 The Lattice-Boltzmann moment system 

Consider kinetic equations with a diffusive scaling with a small parameter, e, 
together with a rescaling of velocity. This scaling describes the small Knudsen 
and low Mach number limit of kinetic equations, see [7] for details. Under these 
transformations, we obtain 

^ + Iv . V/ = - J^(/ - =: J{f). (1) 

which describes the evolution of a particle density $f(\mathbf{x}, \mathbf{v}, t)$ with $\mathbf{x} = (x, y) \in \mathbb{R}^2$ and $\mathbf{v} = (v_1, v_2) \in \mathbb{R}^2$. For discrete velocity models in two dimensions (2-D) consider a model with nine velocities (N = 9). In the discrete case, the $\mathbf{v}$-dependence of the particle distribution $f(\mathbf{x}, \mathbf{v}, t)$ is uniquely determined through 9 functions $f_i(\mathbf{x}, t) = f(\mathbf{x}, \mathbf{c}_i, t)$, $i = 0, \ldots, 8$, where $\mathbf{c}_i$ are the discrete velocities.

A discrete moment $M$ of order $m \in \mathbb{N}$ of $f$ is defined by $M(t, \mathbf{x}) = \langle f(\mathbf{x}, \mathbf{v}, t), P(\mathbf{v}) \rangle_v$, where $P$ is a $\mathbf{v}$-polynomial of degree $m \in \mathbb{N}$. In the following we denote the components of the velocity by $\mathbf{u} = (u, v)$. The scalar product is defined as $\langle g_1, g_2 \rangle_v = \sum_{\mathbf{v}} g_1(\mathbf{v})\, g_2(\mathbf{v})$. Equation (1) is transformed into an equivalent set of moment equations (see also [8, 9] for a similar approach) using moments based on $\mathbf{v}$-polynomials [10]. The mass and momentum density are given by the zeroth and first order moments of $f$, $\langle f, 1 \rangle_v = \rho$ and $\langle f, v_i \rangle_v = \rho u_i$. The second order moments form a symmetric tensor, $\Theta = (\Theta^x, \Theta^y)$, and the remaining moments are set to $q$ and $s$. The equations of mass and momentum conservation are

^tp+divpu = 0, + div0 + = 0. (2) 

Here, the divergence is applied to the rows of $\Theta$. The equation for $\Theta$ is

+ ^S[pu] + iq[q] = -pu(g)u), (3) 

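where

$$S[\mathbf{u}] = \frac{1}{2}\begin{pmatrix} 2\partial_x u & \partial_y u + \partial_x v \\ \partial_y u + \partial_x v & 2\partial_y v \end{pmatrix}.$$

Since $Q[q]$, $q$ and $s$ are not needed to derive INS, they will be ignored.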
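
To make the moment construction concrete, the following minimal sketch evaluates the zeroth and first order moments as plain sums over a nine-velocity set. The standard D2Q9 ordering of the velocities is an assumption (the text only states N = 9), and the names `C`, `moment` and `density_and_momentum` are hypothetical.

```python
# Assumed D2Q9 velocity set: rest particle, four axis directions, four diagonals.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]


def moment(f, poly):
    """Discrete moment <f, P>_v = sum_i f_i * P(c_i) for one grid point."""
    return sum(fi * poly(cx, cy) for fi, (cx, cy) in zip(f, C))


def density_and_momentum(f):
    """Zeroth moment (density) and first moments (momentum) of the nine f_i."""
    rho = moment(f, lambda cx, cy: 1.0)
    mx = moment(f, lambda cx, cy: cx)
    my = moment(f, lambda cx, cy: cy)
    return rho, (mx, my)
```

Higher moments follow the same pattern with quadratic or cubic polynomials passed as `poly`.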

From the momentum equation in (2) one can deduce that as e — > 0, p 
approaches a constant p and can be written as p = p(l + 3e^°^p) to obtain 0(1) 
terms. Hence 

dtp + div u = — divpu, dtVL -\- div ^0 + Vp = — (4) 
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For e ^ 0, equation (3) yields at the lowest order 

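$$\frac{1}{\bar\rho}\,\Theta = \mathbf{u} \otimes \mathbf{u} - \frac{2\varepsilon^{1-\alpha}\tau}{3}\,S[\mathbf{u}]. \qquad (5)$$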

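Since $2\,\mathrm{div}\,S[\mathbf{u}] = (\Delta + \nabla\,\mathrm{div})\mathbf{u}$, we obtain from (4) and (5) the INS equations as limiting system for $\alpha = 1$

$$\mathrm{div}\,\mathbf{u} = 0, \qquad \partial_t \mathbf{u} + \mathrm{div}\,(\mathbf{u} \otimes \mathbf{u}) + \nabla p = \frac{\tau}{3}\,\Delta \mathbf{u}, \qquad (6)$$

where the Reynolds number is related to the relaxation time by $Re = 3/\tau$. For $0 < \alpha < 1$, we obtain the incompressible Euler equations

$$\mathrm{div}\,\mathbf{u} = 0, \qquad \partial_t \mathbf{u} + \mathrm{div}\,(\mathbf{u} \otimes \mathbf{u}) + \nabla p = 0.$$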



2.2 Simplified Relaxation Systems 

By neglecting the lower order terms in the above equations (3) and (4) and setting $\bar\rho = 1$, we can introduce a simplified relaxation system:



dtp + div u == 0, dtVL + div © + Vp = 0, 



dt& + V^\u] 



t2a 



snui 



1 



(7) 



(^1 + 



— (0-u®u), 



where 






S^[u] = S[u]-— V>], 



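We have added and subtracted the term

$$V^{\mathbf{a}}[\mathbf{u}] = (a^2\,\partial_x \mathbf{u},\ b^2\,\partial_y \mathbf{u}) = \begin{pmatrix} a^2\,\partial_x u & b^2\,\partial_y u \\ a^2\,\partial_x v & b^2\,\partial_y v \end{pmatrix},$$

with $\mathbf{a} = (a, b) \in \mathbb{R}^2$. Obviously the limit equations for this system are again the INS equations with Reynolds number $Re = 1/\tau$.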

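Considering the nonstiff advection parts in (7) separately for $\mathbf{u}$ and $\Theta$, we obtain a hyperbolic system with characteristic speeds $\pm a$ and $\pm b$ in x and y directions:

$$\partial_t \mathbf{u} + \mathrm{div}\,\Theta = 0, \qquad \partial_t \Theta + V^{\mathbf{a}}[\mathbf{u}] = 0. \qquad (8)$$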



As we will see in the last section, a is chosen depending on the local speed. In 
the x-direction (8) can be written as: 
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A variant of this scheme incorporates a simple linear HLL-type Riemann solver. 
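To incorporate the HLL type formulation, equation (9) is written as:

$$\partial_t \begin{pmatrix} \mathbf{u} \\ \Theta^x \end{pmatrix} + \begin{pmatrix} 0 & I \\ -a^- a^+\, I & (a^- + a^+)\, I \end{pmatrix} \partial_x \begin{pmatrix} \mathbf{u} \\ \Theta^x \end{pmatrix} = 0, \qquad (10)$$

where $a^-$ and $a^+$ are determined by some algorithm [17]. The submatrices 0 and I are 2x2 zero and identity matrices. Observe that if $a^- = -a^+$ the formulation (9) is obtained.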
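
A quick way to see that the HLL-type formulation carries the two speeds $a^-$ and $a^+$ is to compute the eigenvalues of one scalar 2x2 block. The sketch below assumes the block matrix is read as rows (0, 1) and ($-a^- a^+$, $a^- + a^+$); the function name `hll_speeds` is hypothetical.

```python
import math


def hll_speeds(a_minus, a_plus):
    """Eigenvalues of [[0, 1], [-a_minus*a_plus, a_minus + a_plus]], smallest first."""
    trace = a_minus + a_plus
    det = a_minus * a_plus             # determinant of the 2x2 block
    disc = math.sqrt(trace * trace - 4.0 * det)
    return (trace - disc) / 2.0, (trace + disc) / 2.0
```

With symmetric speeds, `hll_speeds(-a, a)` returns `(-a, a)`, which matches the characteristic speeds of the symmetric system.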



3 Numerical Schemes 

3.1 Space Discretizations 

To discretize the equations in space a uniform grid in x and y with grid points $(x_i, y_j)$ with spacing h is used. Consider the linear system (8). For the x-direction $\Theta^x \pm a\mathbf{u}$ are the characteristic variables associated with the characteristic speeds $\pm a$, while for the y-direction the characteristic variables associated with the characteristic speeds $\pm b$ are $\Theta^y \pm b\mathbf{u}$. Similarly the characteristic variables associated with the x-direction for the HLL type scheme in equation (10) are

w+ = , , (©^ - a"u) and W~ = , 7 ^^" , (©^ - a+u). (11) 

(a+ — a“) (a+—a~) 

The values of the characteristic variables will be determined at cell edges 
as in [2]. Hence the numerical fluxes at cell boundaries are: 

= I - I - U,,) 

+ ^ (Q"" + «u) - (©"^ - au)) , 

Ui+l/2,- = ^ (uy + Ui+y) - ^ (©?+y - &fj') 

+ au) + cr^+ij(&^ - au)) 

in the case of second order method. If minmod slope limiting is applied then 
a-j are given by o--j(z) = 5minmod(zy - Zj_y,Zi+y - z^). 
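
For readers unfamiliar with the limiter, the minmod function itself can be sketched as follows. This is the standard two-argument minmod; the scaling factor in front of it in the slope formula is not reproduced, and the helper `limited_differences` is a hypothetical one-dimensional illustration.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if a and b share a sign, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def limited_differences(z):
    """Minmod of backward and forward differences at the interior points of a list."""
    return [minmod(z[i] - z[i - 1], z[i + 1] - z[i]) for i in range(1, len(z) - 1)]
```

At a local extremum the two one-sided differences have opposite signs, so the limited difference vanishes and the reconstruction falls back to first order there.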

For the HLL type scheme the numerical fluxes at cell edges are 
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" (a+ - a-) { (a+ a") 

- Q+ ("»+« + Uy) + 5 (4(^^) + <^l+ij(^”)) }• 

We denote by F^, the discretization of the convective parts div© and 
V^[u] in equation (7), respectively. Using the numerical fluxes above one ob- 
tains 

f;(0,u) = j(e-+./a - et./w) + ^ (e?,«/2 - efj-./j). 

F|(®>u) = (i(a2ui+i/2j -a2ui_i/2j),i(62uy+i/2 -62uy_i/2)). 

And for the HLL type scheme we get the following convective term: 

F|(®>u) = (^(^-a-a+Ui+i/2j + a-a+Ui_i/2j^ 

+J^ ((^~ + '^'^)®i+i/2,j ~ > 

-(^-6~6+Uy+i/2 + b~b+Uij_i/2^ 

+ -1/2)) ■ 

To obtain a multidimensional scheme the computation of the cell averages is 
modified. To take advantages of diagonal points of the cell the trapezoidal 
approximation is used. The cell averages are thus defined as: 

= \ (/("D + ) + /("D) (12) 

where i , i ), =pij{Xi^i,yj_i) and Pij{x,y) = 

$z_{ij} + (z_x)_{ij}(x - x_i) + (z_y)_{ij}(y - y_j)$. The slopes $(z_x)_{ij}$ and $(z_y)_{ij}$ are (at least first order) approximations to the derivatives $z_x$ and $z_y$, respectively. The flux is the equilibrium flux (5), $f(\mathbf{u}) = \mathbf{u} \otimes \mathbf{u} - (2\varepsilon^{1-\alpha}\tau)/3\; S[\mathbf{u}]$ as $\varepsilon \to 0$.

Denote the discrete gradient by $G_h$ and the discrete divergence by $D_h$, which are given by second order centered differences. Second order centered difference approximations of $S^{\mathbf{a}}$ and $S$ are denoted by $S^{\mathbf{a}}_h$ and $S_h$. Hence we obtain the spatial discretization:

P + '^'^h • u = 0, u + FJ^(0, u) + GhP = 0, 

e + F|(e, u) + jlsuu) = (e - u ® u) 



( 13 ) 
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or equivalently 

T>h • GhP - = -T>h • F^(0, u), u + F^(0, u) + GhP = 0, 

© + Fli&, u) = (® - u ® u + 2ei-“rS^(u)) . 

A corresponding high order upwind based space discretization for the INS equations is obtained by considering the limit of the above discretization as $\varepsilon \to 0$:

Dh • GhP = -Dh • Fft,(u), ii = -F^ ® u - 2e^““rS?j(u), - Ghp. 



3.2 Time Discretizations 

To treat only the limit equations ($\varepsilon = 0$) any explicit high order Runge-Kutta method [11] can be used in combination with a Poisson solver and the limiting (relaxed) spatial discretization.

Further one would want to discretize the relaxation system (13) for all ranges of the parameter $\varepsilon$. This allows the study of the numerical passage from the Boltzmann to the INS regime. An implicit-explicit (IMEX) Runge-Kutta method of the type recently developed in [12] is used as suggested in [6]. We denote the time step by k and use superscript n to denote the time iterations. For the second order time discretization a two stage IMEX Runge-Kutta method [12] which guarantees second order accuracy in the stiff limit is chosen. For $\varepsilon \to 0$ this scheme suggests a formulation for a second order time discretization of INS equations, i.e. the projection is taken at every step:

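Step 1:

$$\Delta p^{n+1/2} = \frac{1}{k\gamma}\,\mathrm{div}\,(\mathbf{u}^n - k\gamma\,\mathrm{div}\,\Theta^n),$$

$$\mathbf{u}^{n+1/2} = \mathbf{u}^n - k\gamma\,(\mathrm{div}\,\Theta^n + \nabla p^{n+1/2}),$$

$$\Theta^{n+1/2} = \mathbf{u}^{n+1/2} \otimes \mathbf{u}^{n+1/2} - 2\varepsilon^{1-\alpha}\tau\,S[\mathbf{u}^{n+1/2}].$$

Step 2:

$$\Delta p^{n+1} = \frac{1}{k\gamma}\,\mathrm{div}\,\big(\mathbf{u}^n - k\,(\delta\,\mathrm{div}\,\Theta^n + (1-\delta)\,\mathrm{div}\,\Theta^{n+1/2} + (1-\gamma)\,\nabla p^{n+1/2})\big),$$

$$\mathbf{u}^{n+1} = \mathbf{u}^n - k\,\big(\delta\,\mathrm{div}\,\Theta^n + (1-\delta)\,\mathrm{div}\,\Theta^{n+1/2} + (1-\gamma)\,\nabla p^{n+1/2} + \gamma\,\nabla p^{n+1}\big),$$

$$\Theta^{n+1} = \mathbf{u}^{n+1} \otimes \mathbf{u}^{n+1} - 2\varepsilon^{1-\alpha}\tau\,S[\mathbf{u}^{n+1}],$$

with $\gamma = 1 - \sqrt{2}/2$ and $\delta = 1 - 1/(2\gamma)$, which gives a scheme for the incompressible Euler equations ($0 < \alpha < 1$) and INS equations for $\alpha = 1$.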
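
The particular values of $\gamma$ and $\delta$ can be checked against the usual second-order conditions of a two-stage IMEX Runge-Kutta scheme. The sketch below assumes the weights are read off Step 2 (explicit part on $\mathrm{div}\,\Theta$, implicit part on $\nabla p$) and that the intermediate stage sits at time $t^n + \gamma k$; the function name is hypothetical.

```python
import math

gamma = 1.0 - math.sqrt(2.0) / 2.0
delta = 1.0 - 1.0 / (2.0 * gamma)

# Explicit part: weights on div Theta^n and div Theta^{n+1/2}, evaluated at times 0 and gamma.
explicit_weights = (delta, 1.0 - delta)
explicit_times = (0.0, gamma)

# Implicit part: weights on grad p^{n+1/2} and grad p^{n+1}, evaluated at times gamma and 1.
implicit_weights = (1.0 - gamma, gamma)
implicit_times = (gamma, 1.0)


def second_order_defects():
    """Return the deviations from sum(b) = 1 and sum(b*c) = 1/2 for both parts."""
    defects = []
    for weights, times in ((explicit_weights, explicit_times),
                           (implicit_weights, implicit_times)):
        defects.append(sum(weights) - 1.0)
        defects.append(sum(w * c for w, c in zip(weights, times)) - 0.5)
    return defects
```

All four defects vanish up to rounding, which is consistent with the second order accuracy claimed for this choice of coefficients.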
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Guided by the above formulation we obtain the second-order modified 
Runge-Kutta scheme for INS: 

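Step 1:

$$\Delta p^{n+1/2} = \mathrm{div}\,\Big(\frac{1}{k}\,\mathbf{u}^n - \mathrm{div}\,\Theta^n\Big), \qquad \mathbf{u}^{n+1/2} = \mathbf{u}^n - k\,(\mathrm{div}\,\Theta^n + \nabla p^{n+1/2}).$$

Step 2:

$$\Delta p^{n+1} = \frac{2}{k}\,\mathrm{div}\,\Big(\mathbf{u}^n - \frac{k}{2}\,(\mathrm{div}\,\Theta^n + \mathrm{div}\,\Theta^{n+1/2})\Big),$$

$$\mathbf{u}^{n+1} = \mathbf{u}^n - \frac{k}{2}\,(\mathrm{div}\,\Theta^n + \mathrm{div}\,\Theta^{n+1/2}) - \frac{k}{2}\,\nabla p^{n+1}.$$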

In addition, we do not have to compute $\Theta$ at every time step unless it is needed. We rather compute $\mathrm{div}\,\Theta$ instead. This reduces the number of variables. The usual hyperbolic and parabolic CFL conditions have to be fulfilled to guarantee stability.



4 Numerical Examples and Results 

The cases for $\varepsilon = 0$ will be tested in this section. The space discretization in equation (13) will be applied. For slope limiting the van Leer limiter is used.

Example 1: Taylor-Vortex Flow - Accuracy test

Relaxation schemes will be first tested on incompressible Euler equations, i.e. $\nu = \tau = 0.0$, augmented with smooth periodic initial data. The test admits the following exact solution [13]:

$$u(x, y, t) = -\cos(2\pi x)\sin(2\pi y)\exp(-2\nu t), \qquad v(x, y, t) = \sin(2\pi x)\cos(2\pi y)\exp(-2\nu t), \qquad x, y \in [0, 1]. \qquad (14)$$
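
For the inviscid case used in this test the exponential factor equals one, so the exact solution is steady. A minimal sketch of that steady field, with a finite-difference check that it is divergence free, is given below; the function names and the step size `h` are hypothetical choices.

```python
import math


def taylor_vortex(x, y):
    """Velocity (u, v) of the Taylor vortex on the unit square for nu = 0."""
    u = -math.cos(2.0 * math.pi * x) * math.sin(2.0 * math.pi * y)
    v = math.sin(2.0 * math.pi * x) * math.cos(2.0 * math.pi * y)
    return u, v


def discrete_divergence(x, y, h=1e-4):
    """Centered-difference approximation of du/dx + dv/dy at the point (x, y)."""
    du_dx = (taylor_vortex(x + h, y)[0] - taylor_vortex(x - h, y)[0]) / (2.0 * h)
    dv_dy = (taylor_vortex(x, y + h)[1] - taylor_vortex(x, y - h)[1]) / (2.0 * h)
    return du_dx + dv_dy
```

Sampling `taylor_vortex` on the grid points gives both the initial data and the reference values against which the computed velocity can be compared.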

Taking h = 1/N with N = 32, 64, 128, and 256, the solution is computed up to t = 2.0, and two norms of the errors together with the convergence rates of u (the velocity component in the x-direction) are listed in Table 1 below.
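
The convergence rates reported for such a grid-doubling study are base-2 logarithms of successive error ratios. The sketch below shows the computation; the function name is hypothetical, and the sample errors in the usage note assume the exponents $10^{-3}$, $10^{-4}$, $10^{-4}$ for the first error column of Table 1, which reproduce the printed rates.

```python
import math


def convergence_rates(errors):
    """Observed order between consecutive grids when the mesh width is halved."""
    return [math.log2(errors[i - 1] / errors[i]) for i in range(1, len(errors))]
```

For example, `convergence_rates([2.99588e-3, 6.16432e-4, 1.13089e-4])` gives values close to 2.281 and 2.4465, the rates listed for the TVD scheme at N = 128 and N = 256.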

The following schemes have been tested: $R^{\varepsilon=0,2}_{TVD}$ (second-order relaxed scheme with TVD time integration); $R^{\varepsilon=0,2}_{DIRK}$ (second-order relaxed scheme with DIRK time integration).

All examples use the CFL number of 0.475 based on the local flow velocity:

$$a := 2\max\{|u_{i,j}|, |u_{i+1,j}|\}; \qquad b := 2\max\{|v_{i,j}|, |v_{i,j+1}|\}.$$

Table 1. Error norms for the incompressible Euler problem with the initial condition in (14) at t = 2.0 using relaxation schemes, $\nu = 0.0$.

N     Scheme                            1st norm error     Rate      2nd norm error     Rate
64    $R^{\varepsilon=0,2}_{TVD}$       2.99588 · 10^-3    2.5683    4.10701 · 10^-3    2.4753
      $R^{\varepsilon=0,2}_{DIRK}$      2.80480 · 10^-3    2.5351    3.91299 · 10^-3    2.4382
128   $R^{\varepsilon=0,2}_{TVD}$       6.16432 · 10^-4    2.281     9.55745 · 10^-4    2.1034
      $R^{\varepsilon=0,2}_{DIRK}$      5.97678 · 10^-4    2.2305    9.28992 · 10^-4    2.0745
256   $R^{\varepsilon=0,2}_{TVD}$       1.13089 · 10^-4    2.4465    1.8214 · 10^-4     2.3916
      $R^{\varepsilon=0,2}_{DIRK}$      1.05389 · 10^-4    2.5036    1.64960 · 10^-4    2.4935

Example 2: Travelling wave - accuracy test

This numerical test was used by Minion and Brown in [14]. The computational domain of unit length is doubly periodic. The exact solution of the Navier-Stokes equations for this problem is:

$$\begin{aligned} u(x, y, t) &= 1 + 2\cos(2\pi(x - t))\sin(2\pi(y - t))\exp(-8\pi^2\nu t), \\ v(x, y, t) &= 1 - 2\sin(2\pi(x - t))\cos(2\pi(y - t))\exp(-8\pi^2\nu t), \\ p(x, y, t) &= -\big(\cos(4\pi(x - t)) + \cos(4\pi(y - t))\big)\exp(-16\pi^2\nu t). \end{aligned} \qquad (15)$$

Numerical results are presented in Table 2.



Table 2. Error norms for the incompressible Euler problem with the initial condition in (15) at t = 0.7 using relaxation schemes, $\nu = 0.0$.

N     Scheme                            1st norm error     Rate      2nd norm error     Rate
64    $R^{\varepsilon=0,2}_{TVD}$       1.3222 · 10^-2     1.4691    1.60313 · 10^-2    1.5942
      $R^{\varepsilon=0,2}_{DIRK}$      1.29641 · 10^-2    1.5573    1.57197 · 10^-2    1.5656
128   $R^{\varepsilon=0,2}_{TVD}$       2.63745 · 10^-3    2.3257    3.26585 · 10^-3    2.2954
      $R^{\varepsilon=0,2}_{DIRK}$      2.59158 · 10^-3    2.3226    3.20928 · 10^-3    2.2923
256   $R^{\varepsilon=0,2}_{TVD}$       3.18838 · 10^-4    3.0482    4.2203 · 10^-4     2.952
      $R^{\varepsilon=0,2}_{DIRK}$      3.1598 · 10^-4     3.0359    4.18859 · 10^-4    2.9377


Example 3: Doubly Periodic Shear Layer 

This test was introduced by Bell, Colella and Glaz in [15]. In the periodic, two-dimensional computational domain of size [1 x 1], the following velocity fields are generated as initial conditions:

$$u(x, y, 0) = \begin{cases} \tanh(\rho\,(y - 1/4)), & y \le 1/2, \\ \tanh(\rho\,(3/4 - y)), & y > 1/2, \end{cases} \qquad v(x, y, 0) = \delta\sin(2\pi x),$$

where $\rho$ is the shear layer width parameter and $\delta$ is the strength of the initial perturbation. The strength coefficient $\delta = 0.05$ is kept unchanged. The results of vorticity profiles are displayed using 20 equidistant contours.
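
The initial data of this test are simple enough to state in a few lines of code. The sketch below assumes the piecewise form with tanh profiles centered at y = 1/4 and y = 3/4 and uses the thick-layer value 30 as the default width parameter; the function name and argument names are hypothetical.

```python
import math


def shear_layer_initial(x, y, rho=30.0, delta=0.05):
    """Initial velocity (u, v) of the doubly periodic shear layer on the unit square."""
    if y <= 0.5:
        u = math.tanh(rho * (y - 0.25))
    else:
        u = math.tanh(rho * (0.75 - y))
    v = delta * math.sin(2.0 * math.pi * x)   # small vertical perturbation
    return u, v
```

Passing `rho=80.0` gives the thin shear layer variant of the same initial field.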

In figure 1 vorticity profiles of solutions obtained by applying different 
schemes without recourse to slope limiters are shown. 



[Figure panels: vorticity at T = 1.2, N = 128.]

Fig. 1. Thick Shear Layer ($\rho = 30$) Results for Euler case $\varepsilon = 0$: Staggered Central Scheme [16] (a); Godunov Based Scheme [15] (b); Relaxation Scheme with TVD time integration (c); and Relaxation Scheme with relaxed DIRK time integration (d).



Further a refinement of the grid to N = 256 was made. The same compu- 
tation was repeated with the van Leer limiter applied to the relaxation-based 
schemes. The results are shown in figure 2. In figure 3 a u velocity cut at x = 0.5 is shown.

[Figure panels: vorticity at T = 1.2, N = 256.]

Fig. 2. Comparison of Thick Shear Layer Results for the incompressible Euler case, multidimensional vs. dimension-by-dimension approach: (a) second-order DIRK, (b) second-order TVD, (c) second-order multidimensional with DIRK, (d) second-order multidimensional with TVD.

The relaxation system was also tested on the thin shear layer problem, $\rho = 80$. The Navier-Stokes equation was considered for t = 1.0 using a grid of N = 256. A comparison of the time evolution of total kinetic energy and enstrophy in the incompressible Euler equation up to time t = 2.0 was made. Figure 4 presents the evolution of the decay of the total kinetic energy of the flow and a history of the mean enstrophy.

In all three examples we observe that the relaxed schemes ($\varepsilon = 0$) perform reasonably well in spite of their simplicity. Although the DIRK formulation suggests the projection structure, the direct TVD formulation has better qualitative results. The DIRK formulation tends to be more dissipative. Nevertheless they both tend to achieve their expected convergence rate.

[Figure panels: V cuts at T = 1.2, N = 128 for second order schemes; V cuts at T = 1.2, N = 256 with DIRK time integration.]

Fig. 3. The cut at x = 0.5 of v at t = 1.2 for the Thick Shear Layer Problem computed with the staggered central scheme ('Central'), the Godunov projection method ('Centered'), the relaxation-based scheme with DIRK time integration ('Centered DIRK') and the relaxation-based scheme with TVD time integration ('Centered TVD') (Left). The cut at x = 0.5 of v at t = 1.2 for the Thick Shear Layer Problem computed with the Godunov projection method ('Centered'), the relaxation-based scheme ('2nd Order') and the multidimensional relaxation-based scheme ('2nd Order Mult.') (Right).

[Figure panels: kinetic energy for the incompressible Navier-Stokes equation; enstrophy for the incompressible Navier-Stokes equation.]

Fig. 4. Comparison of total kinetic energy and enstrophy at t = 2.0, Re = 10000 for the thin shear layer problem computed with the Godunov projection method ('Centered'), the multidimensional relaxation-based scheme with TVD time integration ('Centered TVD') and the multidimensional relaxation-based scheme with DIRK time integration ('Centered DIRK').

[Figure panels: vorticity at T = 1.0, N = 256, Re = 10,000.]

Fig. 5. Comparison of Thin Shear Layer Results for Navier-Stokes case: (a) second-order multidimensional DIRK scheme, (b) second-order multidimensional TVD scheme.

Further we observed that for the DIRK scheme there is no significant difference between the multidimensional and dimension-by-dimension scheme. A comparison with other schemes shows that relaxation schemes are less dissipative than central schemes, while on some occasions they are very close to the Godunov scheme. The resolution of the solution for thin shear layer problems shows that the scheme has a lot of potential for improvement. The implementation of the HLL formulation and relaxing schemes ($\varepsilon \ne 0$) is underway to investigate how this can be achieved.

Acknowledgement: This work was supported by a DFG Grant KL 1105/9.



Summary. The aim of this paper is to present a relaxation scheme designed to approximate the solutions of the system of conservation laws which arises in the modeling of two-phase flows in oil and gas pipelines. The main idea is to relax only the highly non-linear closure laws, thanks to a Lagrangian change of coordinates. By construction, all the fields of the nonlinear hyperbolic relaxation system are linearly degenerate. In addition to the simplicity of the evaluation of the flux function, the relaxation coefficients can be easily devised so as to guarantee the positivity of the mass fractions. We propose to use an integration in time which is explicit with respect to the small eigenvalues and linearly implicit with respect to the large eigenvalues. In the first part, we construct a second-order explicit relaxation scheme based on a Godunov method. In the second part, we present the semi-implicit relaxation scheme. A "stiff" numerical simulation of an industrial case is shown.



Introduction 

The aim of this paper is to present a relaxation scheme designed to approx- 
imate the solutions of the system of conservation laws which arises in the 
modeling of two-phase flows in oil and gas pipelines [10]. The system, made up 
of three conservation laws, is a drift-flux type model and is closed by two thermodynamic and hydrodynamic models. The complexity of these models makes classical numerical schemes such as Godunov or Roe schemes very difficult to use. Our feeling is that only a "rough" scheme would be able to successfully meet the challenge of nonlinearities.

An essential property of our system is that it possesses fast characteristic 
speeds (acoustic waves) and slow ones (corresponding to the mass transport). 
From the engineer’s standpoint, the crucial aspect is the petroleum transport 
and not the acoustic. That is why we will never use an explicit scheme, the 
time step of which is limited by the CFL stability condition: the time required 
for the simulation would be prohibitive due to the large characteristic speeds. 

Inspired by [7], I. Faille and E. Heintze have proposed in [6] a VFRoe-type scheme which is sufficiently rough (diffusive) to handle very stiff industrial cases. The scheme shown in [6] is linearly implicit with respect to the large eigenvalues and explicit with respect to the small eigenvalues. Therefore this semi-implicit scheme will combine accuracy on the waves propagating at the small velocities and reduced CPU-time because the CFL time step limitation only applies to small eigenvalues.

But the VFRoe scheme also suffers from a few drawbacks. The first one is that the scheme is still CPU-time consuming because it needs to compute numerically the eigenvalues of the Jacobian matrix of the system. The second one is that these eigenvalues are not necessarily real, that is, the system is not always hyperbolic. It is observed in [4] that the system is hyperbolic only if the slip between the phases is not too large.

This is why the authors of [2] designed an explicit relaxation scheme. The approach [2] is close in spirit to Jin & Xin's [11] but differs in the fact that only the genuine non-linearities are relaxed, namely the pressure and the hydrodynamic laws. In a first step, the genuine nonlinearities are exhibited by means of a Lagrangian change of coordinates. Then, these are relaxed. Finally, the equations are brought back to the Eulerian frame. The resulting relaxation system is automatically hyperbolic and has all its fields linearly degenerate. This scheme is less CPU-time consuming than the VFRoe-type one because the most complex (algorithmically speaking) step is the computation of only two relaxation coefficients. The relaxation coefficients can be easily devised so as to guarantee two properties: first, the stability of the first order asymptotic system computed thanks to the Chapman-Enskog expansion and, second, the positivity of the mass fractions. As a first attempt to design a fast and rough numerical scheme for two-phase flows, these results were encouraging.

The goal of this paper is to work out a semi-implicit version of the explicit 
scheme [2]. In essence, the extension is possible because of the linear degeneracy 
property of the relaxation system. 

The paper is organized as follows. In section 1, we introduce the two-phase 
flow model together with the boundary conditions and the characteristics of 
this model. In section 2, we develop the second order explicit relaxation scheme 
with the computation of new relaxation coefficients and boundary conditions. 
In section 3, we present the semi-implicit scheme and section 4 is devoted to 
the numerical results. 

Most of the results presented here are extracted from [1], [2] and [3].



1 Two-phase flow model 

1.1 Equations 

In the flow, the mixture is characterized by its density $\rho$, its velocity $v$ and its gas (resp. liquid) mass fraction $Y$ (resp. $X = 1 - Y$). The model is governed by the following system of conservation laws:

$$\begin{aligned} \partial_t(\rho) + \partial_x(\rho v) &= 0, \\ \partial_t(\rho v) + \partial_x(\rho v^2 + P(\mathbf{u})) &= S(\mathbf{u}), \\ \partial_t(\rho Y) + \partial_x(\rho Y v - \sigma(\mathbf{u})) &= 0, \end{aligned} \qquad (1)$$

for all $x \in \mathbb{R}$ and $t > 0$, where the unknown is $\mathbf{u} := (\rho, \rho v, \rho Y)$. Here the source term $S$ includes gravity and wall friction terms which are given functions of the unknown. The two following functions of the unknown $\mathbf{u}$

$$\sigma(\mathbf{u}) = \rho Y (1 - Y)\,\Phi(\mathbf{u}), \qquad P(\mathbf{u}) = p(\mathbf{u}) + \rho Y (1 - Y)\,\Phi(\mathbf{u})^2, \qquad (2)$$

are highly non-linear closure laws. The natural phase space associated with such variables then reads:

$$\Omega = \{\mathbf{u} = (\rho, \rho v, \rho Y) \in \mathbb{R}^3;\ \rho > 0,\ v \in \mathbb{R},\ Y \in [0, 1]\}.$$

Considering the pressure $p$, we will consider a perfect gas and a compressible liquid. The pressure law can be put in the general form $p = p(\rho, \rho Y)$ and is a smooth function. We consider a general algebraic hydrodynamic law of the type

$$\Phi = \Phi(\mathbf{u}), \qquad \forall\, \mathbf{u} \in \Omega, \qquad (3)$$

in order to close (1). In (3), the mapping $\Phi$ is assumed to be smooth enough. In practical situations, $\Phi$ turns out to be nonlinear in the unknown $\mathbf{u}$ (see [4], [12] for instance).

1.2 Boundary and initial conditions 

At the inlet of the pipe ($x = 0$), the mass flowrates are given as functions of time, i.e.

$$(\rho v)_a(0, t) = q_a(t), \qquad t > 0, \quad a = L, G,$$

where $(\rho v)_L$ (resp. $(\rho v)_G$) denotes the mass flux of the liquid (resp. the gas). At the outlet of the pipe ($x = L$, the length of the pipeline), the pressure
is a given function of time, i.e. 

p{L, t) =p^{t), t > 0. 

We will treat cases in which the flow is induced only by variations of the 
boundary conditions: in such experiments, the initial condition is the steady 
state, computed by the values of the boundary conditions at time t = 0. 

1.3 Characteristics of the model 

There is no analytical expression for the physical flux of the considered system, except for very simple hydrodynamic laws. Therefore the eigenvalues of the system (1) are not known in full generality. However, in most common situations, i.e., for usual values of $\mathbf{u}$, the system is hyperbolic and has three real eigenvalues $\lambda_1 < \lambda_2 < \lambda_3$ with $\lambda_1 < 0$ and $\lambda_3 > 0$.

The interesting property of the system is that the eigenvalues $\lambda_{1,3}$ correspond to the acoustic waves (or "pressure waves") and propagate at fast speeds owing to the compressibility of the fluid. The eigenvalue $\lambda_2$, which has a variable sign, corresponds to the kinematic waves and propagates at slow speeds with the fluid. These properties imply that the large eigenvalues are 10-100 times bigger than the small eigenvalue, i.e. $|\lambda_{1,3}| \gg |\lambda_2|$.

2 Explicit relaxation scheme



2.1 The relaxation model 



We relax all the genuine non-linearities of the equilibrium system (1) which appear by means of a Lagrangian change of coordinates. Then, we introduce two new state variables $\Sigma$ and $\Pi$ which are intended to coincide respectively with $\sigma(\mathbf{u})$ and $P(\mathbf{u})$ in the limit of the relaxation parameter $\lambda$. Finally, back in Eulerian coordinates, we propose as a relaxation model the following system:

$$\begin{aligned} \partial_t \rho + \partial_x(\rho v) &= 0, \\ \partial_t(\rho v) + \partial_x(\rho v^2 + \Pi) &= S(\mathbf{u}), \\ \partial_t(\rho \Pi) + \partial_x(\rho \Pi v + a^2 v) &= \lambda\rho\,(P(\mathbf{u}) - \Pi), \\ \partial_t(\rho Y) + \partial_x(\rho Y v - \Sigma) &= 0, \\ \partial_t(\rho \Sigma) + \partial_x(\rho \Sigma v - b^2 Y) &= \lambda\rho\,(\sigma(\mathbf{u}) - \Sigma), \end{aligned} \qquad (4)$$

where $a$ and $b$ are two real positive parameters that we call "relaxation coefficients". The above relaxation system will be given hereafter the convenient abstract form:

$$\partial_t \mathbf{v} + \partial_x \mathcal{G}(\mathbf{v}) = \lambda\,\mathcal{R}(\mathbf{v}) + \mathcal{S}(\mathbf{v}), \qquad t > 0, \quad x \in \mathbb{R}, \qquad (5)$$

where the functions $\mathcal{R}$ and $\mathcal{S}$ receive clear definitions.

The first order system with no source term extracted from (4) admits five real eigenvalues: $v$, $v \pm a\tau$, $v \pm b\tau$ ($\tau = 1/\rho$) and five linearly independent corresponding eigenvectors. Consequently, the first order system with no source term extracted from (4) is hyperbolic. Moreover, each eigenvalue is associated with a linearly degenerate field.

Because of the specific values of the functions $P$ and $\sigma$, we always have $a > b$. This is in harmony with the fact that $a$ is associated with pressure waves and $b$ with kinematic waves.



2.2 First order explicit scheme 

The pipeline is made of $I$ cells, denoted $(M_i)_{i=1,\dots,I}$. Let $x_i$ be the center of the
cell $M_i$ and $\Delta x$ its length. We also denote by $x_{i+1/2} = (x_i + x_{i+1})/2$ the interface
between two cells, by $x_{1/2} = 0$ the inlet boundary interface and by $x_{I+1/2} = L$
the outlet boundary interface. Let $\Delta t^n = t^{n+1} - t^n$ be the time step. Let
$u_i^n \approx u(x_i, t^n)$ be the discrete unknown. The numerical scheme is based on
the following splitting method.

1. Relaxation. We take $\lambda = \infty$ and solve the ODE system $\partial_t v = \lambda \mathcal{R}(v)$ by
projecting the variables on the equilibrium manifold, i.e. we set $\Pi_i^n := P(u_i^n)$
and $\Sigma_i^n := \sigma(u_i^n)$.

2. Evolution. We take $\lambda = 0$ and solve the system $\partial_t v + \partial_x Q(v) = \mathcal{S}(v)$ over
one iteration in order to reach the time $t = t^{n+1}$.
In the evolution step, we consider the system $\partial_t v + \partial_x Q(v) = \mathcal{S}(v)$ that we
approximate by a classical Godunov finite volume scheme which is based on
the update $v_i^{n+1} = v_i^n - \frac{\Delta t^n}{\Delta x} \left( Q(v^*_{i+1/2}) - Q(v^*_{i-1/2}) \right)$. The Riemann solution
$v^*_{i+1/2}$ on the interface $i + 1/2$ is made of six constant states separated by five
contact discontinuities (see [2] for details). It is easily computed and leads to
a low cost numerical scheme.
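
As an illustration only, the conservative finite volume update of the evolution step can be sketched as follows. This is not the authors' implementation: the function name `finite_volume_update` and the array layout are hypothetical, the interface fluxes are assumed to have been computed beforehand by some Riemann solver, and the source term is left out.

```python
import numpy as np


def finite_volume_update(v, interface_flux, dt, dx):
    """Conservative update v_i <- v_i - dt/dx * (F_{i+1/2} - F_{i-1/2}).

    v              : array (I, m), cell values at time t^n (m unknowns per cell)
    interface_flux : array (I + 1, m), fluxes at x_{1/2}, ..., x_{I+1/2}
                     (assumed to be provided by a Riemann solver)
    """
    v = np.asarray(v, dtype=float)
    flux = np.asarray(interface_flux, dtype=float)
    if flux.shape[0] != v.shape[0] + 1:
        raise ValueError("expected one more interface than cells")
    # Each interior flux leaves one cell and enters its neighbour,
    # so the scheme is conservative by construction.
    return v - (dt / dx) * (flux[1:] - flux[:-1])
```

Because every interior flux appears once with each sign, the total of each unknown changes only through the inlet and outlet fluxes.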

The time step of the relaxation explicit scheme is limited by a CFL condi- 
tion. 

2.3 Relaxation coefficients 

We present in this section the computation of the relaxation coefficients $a$ and
$b$. We consider one local couple $(a_{i+1/2}, b_{i+1/2})$ at each interface in
order to minimize the numerical dissipation. The relaxation coefficients are
designed in order to ensure:

1. the stability of the first order asymptotic equilibrium system thanks to the
Chapman-Enskog expansion

$$\Pi = P + \lambda^{-1} \Pi_1 + O(\lambda^{-2}), \qquad \Sigma = \sigma + \lambda^{-1} \Sigma_1 + O(\lambda^{-2}). \tag{6}$$

The Chapman-Enskog analysis justifies the following choice: 

tti+i/2 = A /max(A(UL),A(UR)), A(u) = -Pr(u) -h Py{u), , . 

h+1/2 = i/max(P(uL),P(uR)), P(u) = <7^{n). 

2. physical properties of the approximate solution.

It is possible to compute the relaxation coefficients $a$ and $b$ in order to
satisfy, at the discrete level, the following two basic physical properties:
the positivity of the density and, above all, the maximum principle on the
gas mass fraction, i.e., $Y \in [0, 1]$. This is done by taking $a^{(2)}_{i+1/2}$ and $b^{(2)}_{i+1/2}$
large enough, which results in an increased, but well-adjusted, amount of
numerical dissipation.

Finally, on one interface $x_{i+1/2}$, the relaxation coefficients are
$a_{i+1/2} = \max(a^{(1)}_{i+1/2}, a^{(2)}_{i+1/2})$ and $b_{i+1/2} = \max(b^{(1)}_{i+1/2}, b^{(2)}_{i+1/2})$,
where the superscripts (1) and (2) refer to the values required by the two criteria above.

2.4 Second order in space and time 

The scheme is extended to second-order accuracy in space by using the classical
MUSCL (Monotonic Upstream Scheme for Conservation Laws) technique
([8], [9]). Instead of taking a constant approximate solution on each cell, we
construct a linear approximation. The limited slopes are those of the “physical
variable” $(\rho, Y, v)$ rather than those of the conservative variable $u$ [6]. We
choose the classical minmod slope limiter.
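
For readers unfamiliar with the minmod limiter, here is a minimal sketch of a limited linear reconstruction for one scalar variable on a uniform grid. The function names and the zero slope in the two boundary cells are assumptions made for the example, not details taken from the paper.

```python
import numpy as np


def minmod(a, b):
    """Return the argument of smaller magnitude where a and b share a sign, else 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.where(a * b > 0.0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)


def muscl_interface_values(w, dx):
    """Limited linear reconstruction of the cell values w (shape (I,)).

    Returns (w_left, w_right), the reconstructed values at the left and right
    edges of every cell. Boundary cells keep a zero slope (an assumption).
    """
    w = np.asarray(w, dtype=float)
    slope = np.zeros_like(w)
    backward = (w[1:-1] - w[:-2]) / dx
    forward = (w[2:] - w[1:-1]) / dx
    slope[1:-1] = minmod(backward, forward)
    return w - 0.5 * dx * slope, w + 0.5 * dx * slope
```

At a local extremum the two one-sided slopes have opposite signs, so the limiter returns zero and the reconstruction falls back to the first order constant state.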

The scheme is extended to second-order accuracy in time thanks to 
a Runge-Kutta second-order procedure. 
3 Semi-implicit relaxation scheme

3.1 First order linearly implicit scheme 

The linearly implicit scheme is classically split into three steps.

1. In the physical step, one computes a “predictor” using the explicit scheme

$$v_i^{*} = v_i^{n} - \frac{\Delta t^n}{\Delta x} \left( H(v_i^n, v_{i+1}^n) - H(v_{i-1}^n, v_i^n) \right) + \Delta t^n \, \mathcal{S}(v_i^n). \tag{8}$$

Here $H$ is the numerical flux, based on a Godunov method.

2. In the mathematical step, one solves the linear equation in $\delta v$

$$\alpha_{i-1} \, \delta v_{i-1} + \beta_i \, \delta v_i + \gamma_{i+1} \, \delta v_{i+1} = v_i^{*} - v_i^{n}, \tag{9}$$

where $\alpha_{i-1}$, $\beta_i$ and $\gamma_{i+1}$ are $5 \times 5$ matrices involving partial derivatives of
the numerical flux.

This consists, at each time step, in solving a linear system $A \, \delta v = b$ thanks
to, for example, a Gauss method. The matrix $A$ is block tridiagonal
with blocks of size $5 \times 5$: the resulting matrix is a band matrix
with 9 extra-diagonal terms.

3. Finally, we can update the conservative variable of the cell $i$ by $v_i^{n+1} = v_i^n + \delta v_i$.
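
The mathematical step amounts to a block tridiagonal solve. The sketch below shows one possible block Gaussian elimination (a block Thomas algorithm) in NumPy. It is an illustration under stated assumptions, not the solver used by the authors: the function name and storage layout are hypothetical, `lower[i]`, `diag[i]` and `upper[i]` play the roles of the blocks multiplying $\delta v_{i-1}$, $\delta v_i$ and $\delta v_{i+1}$ in row $i$ of (9), and no pivoting across blocks is performed, so the diagonal blocks are assumed to stay well conditioned.

```python
import numpy as np


def solve_block_tridiagonal(lower, diag, upper, rhs):
    """Solve a block tridiagonal system by block Gaussian elimination.

    lower, diag, upper : arrays (I, m, m); lower[0] and upper[-1] are ignored
    rhs                : array (I, m)
    Returns the solution as an array (I, m).
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    d = np.array(diag, dtype=float)   # working copies, modified in place
    r = np.array(rhs, dtype=float)
    n = d.shape[0]

    # Forward elimination: remove the sub-diagonal block of every row.
    for i in range(1, n):
        # factor = lower[i] @ inv(d[i-1]), computed through a linear solve
        factor = np.linalg.solve(d[i - 1].T, lower[i].T).T
        d[i] -= factor @ upper[i - 1]
        r[i] -= factor @ r[i - 1]

    # Back substitution.
    x = np.empty_like(r)
    x[-1] = np.linalg.solve(d[-1], r[-1])
    for i in range(n - 2, -1, -1):
        x[i] = np.linalg.solve(d[i], r[i] - upper[i] @ x[i + 1])
    return x
```

The cost grows linearly with the number of cells and cubically with the block size, which is one reason why reducing the block size matters for the global linear system.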

Now, the remaining question is: how to construct the matrices $\alpha$, $\beta$ and
$\gamma$? It is easy (see [6]) to compute these matrices when the explicit scheme
is a Roe-type scheme. The main objective is therefore to put the relaxation
explicit scheme under Roe’s form. Such a rewriting is based on a shock curve
decomposition and is possible when the solution of the Riemann problem is
made of shocks and contact discontinuities (which is the case of the relaxation
system). This computation is detailed in [1, 3].

3.2 First order semi-implicit relaxation scheme

The relaxation scheme is constructed so as to be:

- linearly implicit on the fast waves of speed $v \pm a\tau$ (“pressure waves”),

- explicit on the slow waves of speed $v$ (“kinematic waves”) and the associated
relaxation waves of speed $v \pm b\tau$.

Accordingly, when one computes the partial derivatives of the numerical flux,
only the terms associated with the largest eigenvalues are kept: we nullify
the remaining entries of the diagonal matrix involved in the computation of Roe’s matrix.
The eigenvalues of the Jacobian matrices are modified in the same way. The
matrices $\alpha$, $\beta$ and $\gamma$ are then computed with these modified matrices.
Time step of the semi-implicit relaxation scheme. At each time step,
one computes two auxiliary time steps. The explicit time step is computed 
on the base of the small eigenvalues of the relaxation system with a CFL 
number of 0.5. The linearly implicit time step is computed on the base of 
the large eigenvalues of the relaxation system with a CFL number of 20. The 
final time step is the minimum of the two last time steps. The CFL number 
20, experimentally determined, turns out to be a good compromise between a 
small CPU-time and an acceptable amount of smearing out for the numerical 
solution. 
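
The time step selection just described can be sketched as follows. This is a hedged illustration: the function name is hypothetical, the relaxation coefficients $a$ and $b$ are taken as global scalars although the scheme uses one couple per interface, and the CFL number is assumed to be defined by dt = CFL * dx / max|eigenvalue|.

```python
import numpy as np


def semi_implicit_time_step(v, tau, a, b, dx, cfl_explicit=0.5, cfl_implicit=20.0):
    """Return min(explicit step, linearly implicit step).

    v, tau : arrays of cell velocities and specific volumes (tau = 1/rho)
    a, b   : positive relaxation coefficients (scalars here, an assumption)
    """
    v = np.asarray(v, dtype=float)
    tau = np.asarray(tau, dtype=float)
    # Slow waves: v and v +/- b*tau; the largest modulus is |v| + b*tau.
    slow_speed = np.max(np.abs(v) + b * tau)
    # Fast waves: v +/- a*tau.
    fast_speed = np.max(np.abs(v) + a * tau)
    dt_explicit = cfl_explicit * dx / slow_speed
    dt_implicit = cfl_implicit * dx / fast_speed
    return float(min(dt_explicit, dt_implicit))
```

With the values 0.5 and 20 quoted above, the implicit bound becomes the active one only when the fast speeds exceed the slow ones by a factor larger than 40.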

Implicit projection. We follow the ideas of Chalons [5]. The use of a linearly
implicit scheme implies that the projection step is giving steady states that 
are not accurate. The solution is to link the evolutions of the relaxed variables 
associated with an implicit field to the evolutions of the equilibrium variables. 
This modifies the linear system to be solved in the mathematical step. It 
enables us to reduce the size of each block from 5 x 5 to 4 x 4 and therefore 
reduce the size of the global linear system. Numerical experiments show that
not only the steady solutions are more accurate (as already shown in [5]), but 
even transient solutions are more accurate. 

3.3 Second order semi-implicit scheme 

The slope limiters used to build the second order explicit scheme lead to a
non-differentiable expression. That is why we choose a simplified version in which
we do not differentiate the nonlinear operator involved in the second order
correction. In the physical step, we first compute the second order correction
of the states $u_{L,R}$ and evaluate the numerical flux. In the mathematical step,
we solve the first order linear system but the derivatives are computed with
the corrected states $u_{L,R}$.

In order to have a second order accuracy in time, we use again the Runge- 
Kutta 2 procedure. Since the scheme is only semi-implicit, the order 2 in time 
is only achieved on the explicit waves and the linearly implicit waves are solved 
with order 1 in time. 



4 Numerical results 

In this Section, we show the numerical results of the semi-implicit relaxation 
scheme: we consider a real-life problem in which the solution is driven by the 
changes of the boundary conditions. 

The details of this test-case are shown in Fig. 1 and the results are given
in Fig. 2.

In this experiment [6], the inlet gas flow rate is decreased from 0.114 to 0 
kg/s. As the mass flowrates are small, the decrease in the inlet gas mass flow 
rate gives rise to negative oil velocities in the upper part of the riser. Therefore, 
Extracted from [6]

Geometry of the pipe and discretization:
- Length: 80 m, vertical
- Diameter: 0.146 m
- Cell size: 1.6 m

Closure laws:
- Thermodynamic: compressible liquid, perfect gas
- Hydrodynamic: like Zuber-Findlay

Source terms: friction + gravity

Boundary and initial conditions:
- Stabilization period: 10 s
- Transient period: 100 s
- Inlet condition on gas flowrate: 0.114 to 0 kg/s
- Inlet condition on liquid flowrate: 1.628 kg/s
- Outlet condition on pressure: 10® Pa

Fig. 1. Detail of the experiment, Zuber-Findlay-like law with boundary conditions



a void fraction between a “single-phase gas” state and a “two-phase” state 
propagates from the outlet down the pipe. Simultaneously, the change in the 
inlet gas mass flowrates induces another void fraction front which propagates 
from the inlet up the pipeline. These two fronts meet around x = 47 m at 
time t = 260 s to form a unique discontinuity wave which propagates toward 
the outlet of the pipe. At time t = 200 s, this discontinuity wave reaches the 
outlet and the pipe turns to a single-phase liquid steady state. 

This test-case is quite stiff. During the simulation, the scheme must handle
two-phase states, liquid state and gas state. Moreover, the discontinuity
propagating at the end of the simulation is between two one-phase states.
The classical VFRoe scheme is not robust enough to handle this case
and that is why the authors of [6] introduced more numerical diffusion in
their VFRoe-TACITE scheme. The numerical results show a good agreement
between the two schemes.




Fig. 2. Experiment 4, gas surface fraction as a function of time (s). Left: TACITE results. Right: relaxation.
The difference between the two schemes is mainly the CPU cost of the
two simulations. The TACITE scheme requires 22 minutes and the relaxation 
scheme requires 7 minutes. See [1] for a discussion on this point. 



Conclusion 

We have presented a second order semi-implicit relaxation scheme. The main
difficulty was to express the relaxation Godunov scheme in Roe’s form, that is
to say, to compute Roe’s matrix. The semi-implicit relaxation scheme is then
explicit for the slow waves and linearly implicit for the fast waves, which enables
us to reduce the CPU time as well as to be more accurate on slow waves.
Numerical experiments show a good agreement with a VFRoe-type scheme on
realistic problems.

The main open issue is the extension of this scheme to compositional
flows, which play a very important role in the petroleum industry. The energy
equation should also be investigated in order to take care of thermal effects.
These systems are explored in [1]. Another effort could be made in developing
robust boundary conditions: the explicit relaxation only ensures the physical
properties inside the domain and not at the boundaries.
Summary. In this work we simulate numerically the transport and biodegradation
of an organic contaminant (BTEX) in the subsurface. A “real world” contaminated 
site is considered and realistic laboratory-derived or field-measured input parame- 
ters are used. The biodegradation of the dissolved contaminant plume is modelled by 
Monod type kinetics. For the computations we use an approximation scheme which 
is based on a higher order finite element method and the two step backward differ- 
entiation formula and was proposed and carefully analyzed by the author in a recent 
work [3]. It was successfully applied to test (benchmark) problems reported in the 
literature (cf. [10, 13]) as well as complex scenarios with an additional numerical 
computation of the flow field by solving the parabolic-elliptic degenerate Richards 
equation; cf. [3]. The higher order approximation scheme has been shown to reduce
significantly the amount of inherent numerical diffusion compared to lower order ones.
Thereby an artificial transverse mixing of the species leading to a strong overestima- 
tion of the biodegradation process and wrong prediction is avoided. 



1 Introduction 

Groundwater contamination by biodegradable organic compounds has be- 
come a serious and widespread environmental problem in industrialized coun- 
tries. Major organic contaminants include petroleum fuels (gasoline, diesel), 
petroleum byproducts (coal tar, coal-tar creosote), and chlorinated solvents. In 
many cases, groundwater contains a mixture of organic contaminants, either 
due to the complex mixture in many non-aqueous phase liquids (NAPLs; e.g., 
gasoline) or due to co-disposal/co-spillage (e.g., landfill leachates). The degra- 
dation of these contaminants is controlled to a large extent by the biological 
and geochemical conditions in the groundwater. Fortunately, biodegradation 
tends to attenuate at least some organics during groundwater transport. 

The question of whether active remediation is required, or whether natural 
processes of attenuation (passive remediation) will be sufficient is a critical 
issue in “real world” situations. Passive or intrinsic remediation is generally 
preferred, if feasible, due to the potential to, firstly, eliminate permanently 
contaminants through biogeochemical transformation or mineralization and, 
secondly, avoid expensive biological, chemical and physical treatments. How- 
ever, the possible attenuation of organic compounds and the impact of that 
contamination on a groundwater resource is difficult to predict since field
sampling limitations make it difficult to develop an accurate mass balance.
Numerical models can be used to help answer these questions, predict the long-term
evolution of the contaminant plume and evaluate factors limiting
biodegradation. However, a predictive capability for decision-making can only be found
in advanced contaminant transport models which include the full range of the 
controlling processes and efficient, accurate and reliable numerical methods for 
solving these equations. Although the understanding of conservative transport 
and the effect of medium heterogeneity on transport are now well advanced,
methods for modelling bioreactive processes, in particular at the field scale, 
are less well understood; cf. [6]. Also, the accuracy of numerical techniques in 
the context of bioreactive transport has been little explored so far. 

In a recent work [3], the author proposed a higher order approximation 
scheme (cf. Sec. 3), based on finite element methods and backward differentia- 
tion formulae, for biochemically reacting contaminant transport in the subsur- 
face with Monod type kinetics (cf. Sec. 2). The numerical scheme was carefully 
compared to a recently published adaptive finite volume approach (cf. [10, 13]) 
by recomputing some computational experiments of biodegradation processes 
presented in [10, 13]. The higher order finite element techniques seemingly pro- 
vided more accurate results than the finite volume methods; cf. [3]. Further, 
the simulations presented in [3] have clearly borne out that in order to en- 
sure the reliability of the numerical discretization of the bioreactive transport 
model, it is of importance to use higher order approximation schemes, in par- 
ticular, for the spatial discretization. Using lower order methods may lead to 
an overrepresentation of the transverse mixing of the substances and, thereby,
to a significant overprediction of the biodegradation process; cf. [6, 13]. Com- 
pletely wrong solutions are obtained, even if the spatial grid is locally refined 
and adapted to the solution. Higher order methods help to overcome these 
difficulties due to their lower inherent numerical diffusion.

In this work we use the approximation scheme suggested by the author in [3] 
to study the long-term evolution and biodegradation of a “real world” field
scale benzene, toluene, ethylbenzene and xylene (BTEX) plume in the subsurface. The
contaminated site is located in the northern part of the city of Geretsried in Germany
close to Munich and was recently analyzed within the interdisciplinary net- 
work project “Sustainable Remediation involving Natural Attenuation” that 
was supported by the Bavarian State Ministry for Regional Development and 
Environmental Affairs; cf. Sec. 4 or contact [7] for further information. At this 
site, large quantities of mineral oil were infiltrated into the soil between 1948 
and 1989 by a chemical laundry. Despite the cleaning up performed in 2001, a 
significant concentration of BTEX is still being measured there. The plan for 
the paper is now as follows. In Sec. 2 we introduce the mathematical model 
describing the transport and Monod type biodegradation of organic contami- 
nants in the subsurface. In Sec. 3 the numerical discretization techniques are 
briefly introduced. The simulated results for the expansion and movement of
the contaminant plume are presented in Sec. 4. 



2 Governing equations 



Microbial activity in the subsurface is dependent on the bioavailability of all 
substrates utilized by the microorganisms. The main substrates are the elec- 
tron donor, the electron acceptor, and the primary carbon source. In the stan- 
dard case of metabolic aerobic degradation, oxygen is the electron acceptor, 
and the contaminant to be degraded acts both as the electron donor and the 
primary carbon source; cf. [5]. Here, for the sake of simplicity, only the basic 
process of aerobic degradation of a single substrate (xylene) is considered. The 
principles hold equally for multiple electron donors or acceptors. 

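For this “simple” scenario, biomass growth is assumed to follow the double
Monod kinetics, also referred to as the double Michaelis-Menten law; cf. [5, 6,
15]. Then the governing equations for the electron donor $c_D$ [ML$^{-3}$], electron
acceptor $c_A$ [ML$^{-3}$] and immobile biomass $c_X$ [ML$^{-3}$] are respectively given
by

$$
\begin{aligned}
\partial_t (\Theta c_i) - \nabla \cdot (D_i \nabla c_i - q c_i) &= -\alpha_i \mu, \\
\partial_t c_X + k_d c_X &= \frac{Y}{\Theta} \left( 1 - \frac{c_X}{c_{X\max}} \right) \mu,
\end{aligned}
\tag{1}
$$

for $i = D, A$, where the Monod term $\mu$ [ML$^{-3}$T$^{-1}$] is defined by

$$
\mu = \Theta \mu_{\max} \, c_X \, \frac{c_D}{K_D + c_D} \, \frac{K_{ID}}{K_{ID} + c_D} \, \frac{c_A}{K_A + c_A} \, \frac{K_{IA}}{K_{IA} + c_A}. \tag{2}
$$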

Thus, we have to consider a coupled system of partial and ordinary differential
equations. In (1), $\Theta$ [-] denotes the volumetric water content, $q$ [LT$^{-1}$] the
Darcy velocity vector (volumetric flux) and $D_i$, $i = D, A$, [L$^2$T$^{-1}$] with

$$D_i = (\Theta d_i + \beta_t |q|) \, I + (\beta_l - \beta_t) \, (q \otimes q)/|q| \tag{3}$$

the dispersion tensor following the Scheidegger parametrization (cf. [14]) in
which $d_i$, $i = D, A$, [L$^2$T$^{-1}$] is the molecular diffusion, $I$ [-] the identity
matrix and $\beta_l$ [L] and $\beta_t$ [L] are the longitudinal and transverse dispersivities,
respectively. We have $\alpha_D = 1$ [-]. The constant $\alpha_A$ [-] denotes the electron
acceptor to donor mass ratio, $k_d$ [T$^{-1}$] is the first order decay rate for the
biomass, $Y$ [-] is the microbial yield coefficient per unit electron donor
consumed (mg biomass per mg electron donor) and $c_{X\max}$ [ML$^{-3}$] is the maximum
biomass concentration. In (2), $\mu_{\max}$ [T$^{-1}$] denotes the maximum growth rate,
$K_i$, $i = D, A$, [ML$^{-3}$] is the half-utilization constant of the electron donor and
acceptor, respectively, and $K_{Ii}$, $i = D, A$, [ML$^{-3}$] is the Haldane inhibition
concentration of the electron donor and acceptor, respectively. The inhibition
term $K_{Ii}/(K_{Ii} + c_i)$, $i = D, A$, proposed by Haldane [8] and Andrews [1], yields
a slower microbial growth and, therefore, a slower effective electron donor
utilization rate at higher concentrations; cf. [15]. In the numerical model,
unrealistically high biomass concentrations are avoided by introducing the term
$1 - c_X/c_{X\max}$ in the second of the equations (1). If microbial growth is not

restricted in the model, simulated microbial concentrations may become very 
large, especially in source areas with continuous electron donor and acceptor 
supply. In real aquifers, the size of the biomass is limited, for example, due 
to a lack of available pore space, production of inhibitory metabolites, lack 
of nutrients and viral attack. The constant $c_{X\max}$ represents the maximum
microbial concentration at which the biomass reaches a quasi-steady state. 
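
To make the reaction terms concrete, the following sketch evaluates the double Monod term with Haldane inhibition and the right-hand side of the biomass equation at a single point. It assumes the reading $\mu = \Theta \mu_{\max} c_X \frac{c_D}{K_D + c_D} \frac{K_{ID}}{K_{ID} + c_D} \frac{c_A}{K_A + c_A} \frac{K_{IA}}{K_{IA} + c_A}$ of equation (2) and $\frac{Y}{\Theta}(1 - c_X/c_{X\max})\mu - k_d c_X$ for the biomass rate. The function and argument names are hypothetical, and an infinite inhibition concentration is taken to mean "no inhibition".

```python
import math


def monod_term(c_d, c_a, c_x, theta, mu_max, k_half_d, k_half_a,
               k_inh_d=math.inf, k_inh_a=math.inf):
    """Double Monod term with optional Haldane inhibition (assumed form of (2))."""

    def inhibition(k_inh, c):
        # K_I / (K_I + c) tends to 1 as K_I tends to infinity.
        return 1.0 if math.isinf(k_inh) else k_inh / (k_inh + c)

    donor = (c_d / (k_half_d + c_d)) * inhibition(k_inh_d, c_d)
    acceptor = (c_a / (k_half_a + c_a)) * inhibition(k_inh_a, c_a)
    return theta * mu_max * c_x * donor * acceptor


def biomass_rate(c_x, mu, theta, yield_coeff, decay_rate, c_x_max):
    """Time derivative of the biomass: limited growth minus first order decay."""
    return (yield_coeff / theta) * (1.0 - c_x / c_x_max) * mu - decay_rate * c_x
```

When the biomass reaches its maximum concentration, the growth contribution vanishes and only the decay term remains, which is the limiting mechanism described above.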

We consider solving (1), (2) over $(0,T) \times \Omega$ where $\Omega \subset \mathbb{R}^d$, $d = 2, 3$, is a
two- or three-dimensional bounded domain and the system (1)-(2) is supplied
with initial conditions

$$c_D(0, \cdot) = c_{D,0}, \quad c_A(0, \cdot) = c_{A,0}, \quad c_X(0, \cdot) = c_{X,0} \quad \text{in } \Omega \text{ at } t = 0, \tag{4}$$

and nonhomogeneous Dirichlet and Robin boundary conditions for $i = D, A$,

$$c_i = g_i \ \text{on } (0,T) \times \Gamma_D, \qquad (q c_i - D_i \nabla c_i) \cdot \nu = h_i \ \text{on } (0,T) \times \Gamma_R. \tag{5}$$

Here, $\nu$ denotes the outer unit normal to the boundary $\partial\Omega = \Gamma_D \cup \Gamma_R$ of $\Omega$. The
existence of a global unique non-negative solution $c_D, c_A \in W_p^{2,1}((0,T) \times \Omega)$,
with $p > 2$, and $c_X \in C^1([0,T]; C(\overline{\Omega}))$ to (1)-(5) for any given $T \in (0, \infty)$
was recently proved; cf. [12]. In particular, the non-negativeness of $c_D$, $c_A$, $c_X$
can be ensured. The proof can be carried over to $d = 3$. For the definition of
$W_p^{2,1}((0,T) \times \Omega)$ we refer to [11].

In our computational experiment, the velocity vector $q$ [LT$^{-1}$] in (1), (3)
and (5) is prescribed analytically. This is done due to a lack of information and 
measurements of the flow field for the considered site; cf. Sec. 4. For numerical 
simulations of contaminant transport and biodegradation scenarios where the 
flow field is additionally computed numerically by solving the parabolic-elliptic 
degenerate Richards equation we refer to [3]. 
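
With an analytically prescribed velocity, the dispersion tensor can be evaluated directly. The sketch below assumes the Scheidegger form $D = (\Theta d + \beta_t |q|) I + (\beta_l - \beta_t)(q \otimes q)/|q|$ for equation (3); the function name and the treatment of a vanishing velocity (pure molecular diffusion) are assumptions made for the example.

```python
import numpy as np


def dispersion_tensor(q, theta, d_mol, beta_l, beta_t):
    """Scheidegger dispersion tensor for a Darcy velocity vector q."""
    q = np.asarray(q, dtype=float)
    norm_q = np.linalg.norm(q)
    identity = np.eye(q.size)
    if norm_q == 0.0:
        # No flow: only molecular diffusion remains (assumed convention).
        return theta * d_mol * identity
    isotropic = (theta * d_mol + beta_t * norm_q) * identity
    directional = (beta_l - beta_t) * np.outer(q, q) / norm_q
    return isotropic + directional
```

The flow direction is an eigenvector with eigenvalue $\Theta d + \beta_l |q|$, and any direction orthogonal to it has eigenvalue $\Theta d + \beta_t |q|$, which is how the longitudinal and transverse dispersivities enter the model.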

The established regularity of solutions to problem (1)-(5) is, by far, too
weak to justify the use of higher order approximation schemes. However, one 
may expect that a higher regularity of the solution still holds in some sense 
locally. This might be sufficient to get a significant advantage of higher order 
approximation schemes over lower order ones. Such superiority of the higher 
order methods was recently confirmed by numerical computations; cf. [3].
Further, higher order regularity results for solutions to equations (1)-(5) are
established in a forthcoming paper; cf. [4].



3 Discretization and solution techniques 

We shall now briefly describe our numerical methods and solution techniques
for solving the equations (1)-(5). For the spatial discretization we use
conforming finite element methods. Here, for simplicity, we assume that $\Omega \subset \mathbb{R}^2$
is a polygonal bounded domain. If either the whole boundary of $\Omega$ or at least
some part of it is curved, we adapt the mesh to the boundary by using the
isoparametric counterparts of the finite elements introduced below; cf. [3]. In
our computations we will consider non-vanishing Dirichlet boundary values $g_i$,
$i = D, A$, in (5). However, for simplicity, the variational formulation of (1)-(5)
is given for homogeneous Dirichlet boundary conditions only. Nonhomogeneous
boundary values are incorporated by standard techniques.

Let $T_h = \{K\}$ be a finite decomposition of mesh size $h$ of $\overline{\Omega}$ into triangles.
The decompositions are assumed to be regular, i.e., “face to face”. We use
standard conforming $P_2$ elements for the Monod model (1)-(5). The approximation
spaces $V_h$ for the electron donor and acceptor $c_i$, $i = D, A$, and $X_h$ for
the biomass $c_X$ are thus defined as
$V_h = \{C_i \in C(\overline{\Omega}) \mid C_i|_K \in P_2(K) \text{ for } K \in T_h\} \cap H^1_{0,D}(\Omega)$
and $X_h = \{C_X \in C(\overline{\Omega}) \mid C_X|_K \in P_2(K) \text{ for } K \in T_h\}$. By $P_j(K)$,
$j \in \mathbb{N}$, we denote the space of all polynomials of maximum degree
$j$. Further, $H^1_{0,D}(\Omega) = \{c \in H^1(\Omega) \mid c = 0 \text{ on } \Gamma_D\}$. Hence, the spatial
discretization of $c_D$, $c_A$ and $c_X$ converges formally of third order with respect
to the norm in $L^2(\Omega)$. Advection-dominated transport of the mobile species
(electron donor and acceptor) introducing local numerical instabilities in the
solution can efficiently be captured by the streamline upwind Petrov-Galerkin
method (SUPG); cf. [3, 9].

For the temporal discretization of problems (1)-(5) we consider a mesh
$\{t_n\}$, $n = 0, \dots, N$, with $t_0 = 0$ and $t_N = T$, for the time variable $t$ and define
$\tau_n = t_{n+1} - t_n$. Due to the generally high stiffness of semidiscretizations of flow
and transport problems, implicit schemes should be preferred in the choice
of time-stepping methods for solving these problems. The backward Euler
method is robust and has excellent stability properties, but it is inaccurate,
being only first-order convergent, and also strongly damping. So, it should
only be used for nonstationary calculations which aim to iterate towards the
steady limit. A scheme having similar stability properties as the backward
Euler method but being of second order accuracy is the two step backward
differentiation formula BDF2 which we use in our computations. Further, to
increase the efficiency of the calculations, an adaptive time stepping procedure
was developed and tested in [3] for the proposed discretization of the transport
and biodegradation model (1)-(5).

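Now, we suppose that sequences $\{q(t_n), \Theta(t_n)\}$ are
explicitly prescribed. Let $P_{Z_h}$ denote the $L^2$-projection onto the finite element
space $Z_h$. The discretization of the Monod model (1)-(5) by the Galerkin
method and the two step backward differentiation formula BDF2 then reads
as follows:

Set $C_i^0 = P_{V_h} c_{i,0}$ and $C_X^0 = P_{X_h} c_{X,0}$. For all time steps $n = 0, \dots, N - 2$
compute approximations $C_i^{n+2} \in V_h$, $i = D, A$, and $C_X^{n+2} \in X_h$ by solving the
equations

$$
\begin{aligned}
&\gamma_{n+2} \, (\Theta(t_{n+2}) \, C_i^{n+2}, v_i) - \gamma_{n+1} \, (\Theta(t_{n+1}) \, C_i^{n+1}, v_i) + \gamma_n \, (\Theta(t_n) \, C_i^{n}, v_i) \\
&\quad + \tau_{n+1} \, (q(t_{n+2}) \cdot \nabla C_i^{n+2}, v_i) + \tau_{n+1} \, (D_i(t_{n+2}) \nabla C_i^{n+2}, \nabla v_i) \\
&\quad - \tau_{n+1} \, \langle (q \cdot \nu) \, C_i^{n+2}, v_i \rangle_{\Gamma_R} + \tau_{n+1} \, (\nabla \cdot q \; C_i^{n+2}, v_i) \\
&= -\tau_{n+1} \, (\alpha_i U^{n+2}, v_i) - \tau_{n+1} \, \langle h_i, v_i \rangle_{\Gamma_R}
\end{aligned}
\tag{6}
$$

for all $v_i \in V_h$, $i = D, A$, and

$$
\gamma_{n+2} C_X^{n+2}(x_j) - \gamma_{n+1} C_X^{n+1}(x_j) + \gamma_n C_X^{n}(x_j) + \tau_{n+1} k_d \, C_X^{n+2}(x_j)
= \tau_{n+1} \frac{Y}{\Theta(t_{n+2})} \left( 1 - \frac{C_X^{n+2}(x_j)}{c_{X\max}} \right) U^{n+2}(x_j)
\tag{7}
$$

for all nodes $(x_j)_{j=1,\dots,M}$ associated with degrees of freedom of $X_h$, where

$$
U^{n+2} = \Theta(t_{n+2}) \, \mu_{\max} \, C_X^{n+2} \, \frac{C_D^{n+2}}{K_D + C_D^{n+2}} \, \frac{K_{ID}}{K_{ID} + C_D^{n+2}} \, \frac{C_A^{n+2}}{K_A + C_A^{n+2}} \, \frac{K_{IA}}{K_{IA} + C_A^{n+2}}. \tag{8}
$$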

By $(\cdot, \cdot)$ and $\langle \cdot, \cdot \rangle_{\Gamma_R}$ we denote the standard inner product in $L^2(\Omega)$
and $L^2(\Gamma_R)$, respectively. Further, in (6) and (7) we use the notation
$\gamma_{n+2} = 1 + \tau_{n+1}/(\tau_{n+1} + \tau_n)$, $\gamma_{n+1} = 1 + \tau_{n+1}/\tau_n$ and $\gamma_n = \tau_{n+1}^2/((\tau_{n+1} + \tau_n)\tau_n)$.

The time step sizes $\tau_{n+1}$ can be chosen adaptively; cf. [3]. Clearly, identity
(7) formulates a pointwise condition for all nodes associated with degrees of
freedom of $X_h$. Using instead of (7) a variational equation, analogously to
(6), leads to stability problems and severe oscillations.
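
The variable step BDF2 weights can be checked on a scalar model problem. The sketch below assumes the coefficients $\gamma_{n+2} = 1 + \tau_{n+1}/(\tau_{n+1} + \tau_n)$, $\gamma_{n+1} = 1 + \tau_{n+1}/\tau_n$ and $\gamma_n = \tau_{n+1}^2/((\tau_{n+1} + \tau_n)\tau_n)$ and applies them to the linear decay equation $y' = -k y$, which stands in for the decay part of the biomass equation. The function names are hypothetical and the example is not the authors' code.

```python
def bdf2_coefficients(tau_new, tau_old):
    """Return (gamma_{n+2}, gamma_{n+1}, gamma_n) for step sizes tau_{n+1}, tau_n."""
    gamma_2 = 1.0 + tau_new / (tau_new + tau_old)
    gamma_1 = 1.0 + tau_new / tau_old
    gamma_0 = tau_new ** 2 / ((tau_new + tau_old) * tau_old)
    return gamma_2, gamma_1, gamma_0


def bdf2_decay_step(y1, y0, tau_new, tau_old, k):
    """One BDF2 step for y' = -k*y.

    Solves gamma_2*y2 - gamma_1*y1 + gamma_0*y0 = -tau_new*k*y2 for y2;
    the equation is linear, so no Newton iteration is needed here.
    """
    gamma_2, gamma_1, gamma_0 = bdf2_coefficients(tau_new, tau_old)
    return (gamma_1 * y1 - gamma_0 * y0) / (gamma_2 + tau_new * k)
```

For equal step sizes the weights reduce to the familiar 3/2, 2 and 1/2, and for any step ratio they satisfy $\gamma_{n+2} - \gamma_{n+1} + \gamma_n = 0$, so constant states are preserved.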

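As usual, for the test functions $v_i$ we choose the basis functions of $V_h$.
Further, let $C_X^l \in X_h$, $l = n, n+1, n+2$, be represented in terms of the finite
element basis functions $\{w_j\}_{j=1}^M$ of $X_h$, i.e., $X_h = \mathrm{span}\{w_j \mid 1 \le j \le M\}$ and
$C_X^l = \sum_{j=1}^M C_{X,j}^l w_j$ for $l = n, n+1, n+2$, where the vector $\mathbf{C}_X^l = (C_{X,1}^l, \dots, C_{X,M}^l)$
denotes the degrees of freedom of $C_X^l$. Then, (7) amounts to solving the system
of equations in the unknown vector $\mathbf{C}_X^{n+2}$

$$
\gamma_{n+2} \mathbf{C}_X^{n+2} - \gamma_{n+1} \mathbf{C}_X^{n+1} + \gamma_n \mathbf{C}_X^{n} + \tau_{n+1} k_d \, \mathbf{C}_X^{n+2}
= \tau_{n+1} \frac{Y}{\Theta(t_{n+2})} \left( 1 - \frac{\mathbf{C}_X^{n+2}}{c_{X\max}} \right) \mathbf{U}^{n+2},
$$

where $\mathbf{U}^{n+2} = (U^{n+2}(x_1), \dots, U^{n+2}(x_M))$ and the products on the right-hand side are understood componentwise.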

Since the BDF2 is a two step method, we need a starting procedure to
compute appropriate approximations $C_D^1$, $C_A^1$ and $C_X^1$ of $c_D(t_1, \cdot)$, $c_A(t_1, \cdot)$ and
$c_X(t_1, \cdot)$, respectively. Here, the first time step is done by performing $M$
substeps of the backward Euler method with step size $\tau_0/M$. In our computations
we use M = 4. To solve the resulting nonlinear systems of equations, a damped 
version of Newton’s method is applied. The linear problems of the Newton iter- 
ation are solved by standard Krylov space methods like GMRES, for instance, 
with SSOR preconditioning. For the future we plan to use multigrid methods 
for the linear solver which is motivated by our former experiences with com- 
puting variably saturated subsurface flow; cf. [2] . In the simple case of a single 
Fig. 1. Computational domain (left) for contaminated site Geretsried and initial
concentration of electron donor xylene (middle) and electron acceptor oxygen (right) 



electron donor and acceptor and a single biomass an alternative treatment of 
the transport and biodegradation problem (1)-(5) seems to be possible. After a
temporal discretization of (1), one may resolve the time-discrete version of the 
second of the equations (1) for the biomass concentration cx and substitute 
the resulting identity into the time-discrete version of the first of the equations 
(1). Thus, the biomass concentration is eliminated from the nonlinear system 
of equations and can be computed in a postprocessing procedure which leads 
to smaller systems of linear equations to be solved. However, a generalization 
of such an approach to the case of multiple microbial populations and, in
particular, its implementation seems to be more complex than an explicit treatment
of the ordinary differential equations for the biomass. Therefore, the approach 
is not considered here. 



4 Computational results 

We shall now present our computational results obtained for a “real world” 
residual waste and thereby provide valuable insights into the complex interac- 
tions of biological, chemical and physical processes that are involved in natural 
attenuation phenomena. The contaminated site is located in the city Geretsried
in Bavaria (Germany) and was recently analyzed within a network project; 
contact [7] for details. Large quantities of mineral oil were infiltrated into the 
soil between 1948 and 1989 by a chemical laundry. Despite some active reme- 
diation, a significant concentration of BTEX is still being measured there. 

The computational domain $\Omega$ in the subsurface is visualized in Fig. 1 with
lengths given in meters. For simplicity and due to a lack of information, a
constant groundwater flow field parallel to the left and right boundary of $\Omega$ is
supposed. The flow direction is from the lower to the upper boundary ad- 
jacent to the river Isar. The measured concentration profiles at the current 
state, assumed to be the initial state of the simulation, are shown in Fig. 1. 
The shapes within the profiles were slightly idealized. We restrict ourselves to 
the electron donor xylene which here is the main component of BTEX with 
70-90%. The xylene concentration inside the ellipse is 12.5 [mg/l] and 0 [mg/l]
elsewhere. Biodegradation of xylene with either oxygen or nitrate as electron
acceptor is observed. Here, we consider the first case. We have a small
rectangular domain overlapping with the xylene ellipse where the oxygen concentration
is 2 [mg/l] and 4.0 [mg/l] in its center. The ambient oxygen concentration is
8.6 [mg/l]. The measured higher oxygen concentration inside the contaminant
source puzzles us and has not been completely understood yet. One reason 
for that might be a lower permeability inside the contaminant source due to 
a lack of available pore space. Further investigations are necessary. Therefore 
and due to the lack of information in particular about the spatial variation 
of the flow field, our simulations have to be considered rather as a qualitative 
analysis of the biodegradation process than as a quantitative prediction of the 
xylene degradation. Nevertheless they contribute to a better understanding of 
natural attenuation phenomena for this site. 

In our calculations we used reliable field-measured and laboratory-derived 
input parameters that were given in [15]. They proved to describe adequately 
field scale degradation provided that all controlling factors are incorporated 
in the field scale model. The flow field q = (0.045,0.15)^, with time given in 
days, and the diffusion-dispersion parameters were obtained by measurements. 
Precisely, we put G\ 0.3, d^^dA’- 8.64e— 5, /St’. 2.0, /3i'. 10.2, : 3.16, kd\ 

0.001, F: 0.52, cx_: 1.0, 4.13, Kd: 0.79, 0.1, Kid: 91.7 J<ia: 

oo. The initial concentration of the biomass was cx(0, •) = 0.003 in i7. We 
chose homogeneous Neumann boundary conditions at the left, right and upper 
boundary and a Dirichlet condition at the lower one. The computations were 
done on an almost uniform grid with 12322 elements. The time step sizes were 
chosen adaptively; cf. [3]. The calculated concentration profiles of the electron
donor (contaminant) xylene, acceptor oxygen and biomass are visualized in 
Fig. 2 to 4. For comparison, the problem was recomputed on a very fine mesh 
with 197152 elements. This was done on a Linux cluster with 16 processors. 
No significant changes in the computed profiles were observed. It shows that 
the proposed numerical scheme reliably predicts the degradation rates even on 
relatively coarse meshes, which is in agreement with the results presented in 
[3]. 

Fig. 2. Concentration of electron donor xylene at T = 20 days, 1.5 years and 4.5 
years 

Fig. 2 to 4 show that the contaminant is transported by the flow field to the 
upper boundary of $\Omega$ adjacent to the river. Simultaneously, it is degraded by 
a reaction between electron donor, acceptor and biomass. The initial oxygen 
concentration inside the contaminant plume becomes depleted within a few 
days, which is not consistent with the measured profile and might result from 
an insufficient description of the flow and permeability conditions inside the 
contaminant source. The reaction between the species is restricted to those 
regions where their concentrations are sufficiently large. Basically, it is the 
interface between the electron donor and the surrounding region where still 
enough acceptor is available. If a numerical method with much artificial 
diffusion is used, this interface between the species smears out and the reaction 
takes place in a larger region. Then, the contaminant is degraded too fast; 
cf. [3, 6, 10, 13]. In particular, this happens if lower order methods are applied 
on insufficiently refined meshes; cf. [3, 13]. 

Summary. As an alternative to classical stabilization schemes such as, for instance, 
Galerkin-Least-Squares or streamline diffusion techniques, a stable equal-order 
finite element scheme for the Navier-Stokes equations is proposed. The approach is 
based on filtering small-scale fluctuations of pressure and velocities by local 
projections. For the Stokes system, we prove stability and analyze the arising system 
matrix. Furthermore, the transport equation is analyzed with respect to stability and 
an a priori estimate is given. 



1 Introduction 

In this note we present a discretization of the stationary Navier-Stokes 
equations based on equal-order finite elements. We combine the pressure 
stabilization for the Stokes equations developed in [2] with a similar technique for the 
nonlinear convection term. The entire approach is based on the use of two 
discrete spaces $V_h$ and $W_{2h}$. We use finite elements on quadrilateral meshes. The 
discrete space $V_h$ corresponds to bilinear finite element functions and $W_{2h}$ to 
piecewise constant elements on a globally coarser mesh. 

Although stabilization by weighted least-squares terms, as for instance 
Galerkin-Least-Squares (GLS) or streamline-upwind Petrov-Galerkin 
(SUPG), see [10, 9, 6, 14], is now classical and provides a rather general 
framework, there is a certain need for different stabilization techniques, see the more 
recent approaches [4, 8, 2, 5]. One common feature of the new approaches is 
that they have better local conservation properties than the classical ones. 
Further, the difficulties of SUPG for higher order polynomials might be 
overcome. The choice of the stabilization parameter no longer depends on the 
constants of an inverse estimate, see [2]. 

Our motivation for development of new stabilized schemes comes from two 
fields of application: a) reacting flows with complex chemistry and b) optimal 
control of incompressible flows. In the first case, SUPG-like stabilization leads 
to enormous coupling terms which are rather difficult and time-consuming to 
compute, see [3]. Further, the design of robust Newton-type methods in this 
context is not evident since one has to decide whether or not to take into 
account the derivative of each stabilization term in the approximate Jacobian. 

Related to the just mentioned problem of computing the derivatives of the 
stabilization terms is the discretization of optimal control problems. Here, the 
critical terms directly influence the quality of the computed gradients of the 
cost functional, see [1] for a discussion. 

In Section 2, we describe the two-level discretization of the stationary 
Navier-Stokes equations. In the following sections, we present some aspects 
of the analysis: Section 3 deals with the arising algebraic system for the Stokes 
equations and Section 4 with stabilization of the convective terms. 



2 Two-level scheme for the Navier-Stokes equations 

Let $\Omega \subset \mathbb{R}^2$ be a polygonal domain. We want to solve the Navier-Stokes 
equations for an incompressible fluid with fluid velocity $v$ and pressure $p$, 

$(v\cdot\nabla)v - \nu\Delta v + \nabla p = f$ in $\Omega$, (1) 

$\operatorname{div} v = 0$ in $\Omega$, (2) 

supplied with homogeneous Dirichlet boundary conditions: 

$v = 0$ on $\partial\Omega$, (3) 

and the normalization of the pressure 

$\int_\Omega p \, dx = 0$. (4) 

In (1), $f \in L^2(\Omega)^2$ represents given data. For the weak formulation of (1)-(4) 
we introduce the following notations: 

$u := (p, v) \in X := L^2(\Omega)/\mathbb{R} \times H^1_0(\Omega)^2$, 

$a(u)(\varphi) := ((v\cdot\nabla)v, \psi) + \nu(\nabla v, \nabla\psi) - (p, \operatorname{div}\psi) + (\operatorname{div} v, \xi)$, (5) 

for test functions $\varphi = (\xi, \psi) \in X$. Now the weak formulation of (1)-(4) reads 
in compact notation: 

$a(u)(\varphi) = (f, \psi) \quad \forall \varphi \in X$. (6) 

We denote by $T_h$ a shape regular partition of the domain into quadrilaterals. 
Hanging nodes are allowed with moderation for ease of local mesh refinement. 
We consider two finite element spaces $V_h$ and $W_{2h}$ which are constructed in the 
following way. 

$V_h$ consists of bilinear finite elements on $T_h$. On hanging nodes, the finite 
element functions are interpolated by the neighbor nodes so that no degrees 
of freedom are present on those irregular nodes. The space $W_{2h}$ consists of 
constants on each cell of a coarser mesh $T_{2h}$ obtained by one global coarsening 
of $T_h$: Each quadrilateral $K \in T_{2h}$ is cut into four new quadrilaterals (dividing 
all lengths of edges of $K$ by 2) in order to obtain the fine partition $T_h$. Note 
that the functions in $W_{2h}$ are discontinuous across edges of elements of $T_{2h}$. We 
indicate the subspaces of discrete functions respecting the Dirichlet condition 
by an additional subscript 0, $V_{h,0} \subset V_h$. The restriction to a mean value of zero, 
cf. (4), is denoted by $V_h/\mathbb{R}$. The discrete solution $u_h = (p_h, v_h)$ is searched in the 
discrete space $X_h := V_h/\mathbb{R} \times V_{h,0}^2$. 

Furthermore, we use the $L^2$-projection onto the piecewise constants, 
$P_W : L^2(\Omega) \to W_{2h}$, and the fluctuation operator $\pi_h : V_h \to V_h$, 

$\pi_h := I - P_W$, (7) 

where $I$ denotes the identity mapping. 
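
To make the roles of $P_W$ and $\pi_h$ concrete, the following is a minimal sketch in a simplified setting that is not the one used in this paper: a one-dimensional uniform mesh on which the function to be filtered is already cell-wise constant, so that the $L^2$-projection onto constants on a coarse patch reduces to the mean over the fine cells of that patch. The function names and the parameter `cells_per_patch` are hypothetical.

```python
import numpy as np

def coarse_projection(values, cells_per_patch=2):
    """P_W for cell-wise constant data on a uniform 1D mesh (assumed setting):
    replace each fine-cell value by the mean over its coarse patch."""
    v = np.asarray(values, dtype=float)
    patches = v.reshape(-1, cells_per_patch)           # one row per coarse cell
    means = patches.mean(axis=1, keepdims=True)        # L2-projection onto constants
    return np.repeat(means, cells_per_patch, axis=1).ravel()

def fluctuation(values, cells_per_patch=2):
    """pi_h = I - P_W: the small-scale part that the stabilization acts on."""
    v = np.asarray(values, dtype=float)
    return v - coarse_projection(v, cells_per_patch)

# Cell-wise values of, e.g., a discrete pressure gradient on four fine cells
g = np.array([1.0, 3.0, 2.0, 2.0])
fluct = fluctuation(g)
```

In this toy setting the fluctuation vanishes on every coarse patch where the data is already constant, which illustrates why the stabilization only penalizes scales that the coarse space $W_{2h}$ cannot represent.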

We use the following stabilization terms, defined on $X_h \times X_h$: 

$s(u)(\varphi) := (\pi_h[(v\cdot\nabla)v], \delta\,\pi_h[(v\cdot\nabla)\psi]) + (\pi_h\nabla p, \alpha\,\pi_h\nabla\xi)$. (8) 

Here, $\alpha$ and $\delta$ denote piecewise constant functions which depend among other 
things on the local cell size $h_K$. The precise definition is: 



I s :\ • f 

a\K '-= , S\k mm — , 

V ^ \M\oc,K 

(9) 

The discrete problem reads: Find $u_h \in X_h$ such that 

$a(u_h)(\varphi_h) + s(u_h)(\varphi_h) = (f, \psi_h) \quad \forall \varphi_h \in X_h$. (10) 



One remarkable feature of (10) is that the stabilization terms only act 
on the diagonal of the coupled system. The structure of the stabilization is 
unchanged if additional lower order terms are added to the equations. 

Our numerical experience shows that the resulting scheme has very similar 
properties to SUPG concerning stability and accuracy. In the following we 
present some aspects of the analysis of (10). 



3 Structure of the system matrix for the Stokes equations 

As the first step, we consider the proposed stabilization in the case of the Stokes 
equations. It can be easily seen that the stabilization term $(\pi_h\nabla p_h, \alpha\,\pi_h\nabla\xi_h)$ in 
(8) leads to a larger stencil for the pressure than the original one coming from 
the Galerkin part. The discrete problem for the Stokes equations reads: Find 
$u_h \in X_h$ such that 

$a(u_h, \varphi_h) + s(u_h, \varphi_h) = (f, \psi_h) \quad \forall \varphi_h \in X_h$ (11) 

holds, where now 

$a(u, \varphi) := \nu(\nabla v, \nabla\psi) - (p, \operatorname{div}\psi) + (\operatorname{div} v, \xi)$, (12) 

$s(u, \varphi) := (\pi_h\nabla p, \alpha\,\pi_h\nabla\xi)$. (13) 

The analysis presented in [2] gives us the following error estimate in terms 
of the $L^2$ norm $\|\cdot\|$ and Sobolev norms $\|\cdot\|_{H^m(\Omega)}$. 

Theorem 1. Let $p \in H^2(\Omega)$ and $v \in H^3(\Omega)^2$. Then, if $V_h$ consists of 
isoparametric biquadratic functions, the solution $(p_h, v_h)$ of (10) for the bilinear forms 
(12), (13) allows for the following error estimate: 

$\|p - p_h\| + \|\nabla(v - v_h)\| \le C h^2 \left(\|p\|_{H^2(\Omega)} + \|v\|_{H^3(\Omega)}\right)$, (14) 

where $h$ denotes the maximal mesh size. 

A similar error estimate holds true for the classical Taylor-Hood element 
with biquadratic velocities and bilinear pressure (which does not require any 
stabilization), see [7]. One might therefore wonder what the additional pressure 
degrees of freedom in our scheme produce. The answer to this question is 
provided in the following. 

We denote by $\mathbf{v}_h$ the vector of coefficients of the function $v_h$ with 
respect to the canonical finite element nodal basis. 

Now, we split the pressure $p_h$ into a coarse grid part $\bar p_h$ and small-scale 
fluctuations $p'_h$: 

$p_h = \bar p_h + p'_h, \quad \bar p_h \in V_{2h}, \quad p'_h \in V'_h := V_h \setminus V_{2h}$. (15) 

The coefficient vectors $\bar{\mathbf{p}}_h$ and $\mathbf{p}'_h$ are defined analogously by the nodal basis. 
Then, the matrix representation of the linear system (11) reads 

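$\begin{pmatrix} A & -B^* & -(B')^* \\ B & S_1 & S_2^* \\ B' & S_2 & S_3 \end{pmatrix} \begin{pmatrix} v_h \\ \bar p_h \\ p'_h \end{pmatrix} = \begin{pmatrix} f_h \\ 0 \\ 0 \end{pmatrix}$, 

where $f_h$ has the obvious meaning. The matrix $A$ stands for the Laplacian, $B$ 
and $B'$ for the divergence, and $S_1$, $S_2$, $S_3$ for the stabilization. 

Block elimination of the pressure component $p'_h$ leads to: 

$\begin{pmatrix} \tilde A & -\tilde B^* \\ \tilde B & \tilde S \end{pmatrix} \begin{pmatrix} v_h \\ \bar p_h \end{pmatrix} = \begin{pmatrix} f_h \\ 0 \end{pmatrix}$, 

where $\tilde A = A + D$, $D := (B')^* S_3^{-1} B'$, 

$\tilde S = S_1 - S_2^* S_3^{-1} S_2$, 

$\tilde B = B - S_2^* S_3^{-1} B'$. 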
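
The block elimination can be checked numerically. The sketch below is only an illustration under stated assumptions: it fills the blocks $A$, $B$, $B'$, $S_1$, $S_2$, $S_3$ with random matrices of arbitrary small sizes (symmetric positive definite where a Laplacian or a symmetric stabilization block is expected) instead of assembling them from finite elements, and it reads the extra block as $D = (B')^* S_3^{-1} B'$, which is the sign consistent with the identity used in the proof of Remark 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n_v, n_c, n_f = 6, 3, 4   # arbitrary sizes: velocity, coarse pressure, fluctuations

def spd(n):
    """Random symmetric positive definite matrix (stand-in for an assembled block)."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

A, S1, S3 = spd(n_v), spd(n_c), spd(n_f)
B = rng.standard_normal((n_c, n_v))          # coarse divergence block
Bp = rng.standard_normal((n_f, n_v))         # fluctuation divergence block B'
S2 = 0.1 * rng.standard_normal((n_f, n_c))   # small coupling keeps the S-block definite
f = rng.standard_normal(n_v)

# Full 3x3 block system for (v, p_bar, p')
K = np.block([[A, -B.T, -Bp.T],
              [B, S1, S2.T],
              [Bp, S2, S3]])
full = np.linalg.solve(K, np.concatenate([f, np.zeros(n_c + n_f)]))

# Reduced 2x2 system after eliminating p'
S3inv = np.linalg.inv(S3)
D = Bp.T @ S3inv @ Bp
A_t = A + D
S_t = S1 - S2.T @ S3inv @ S2
B_t = B - S2.T @ S3inv @ Bp
K_red = np.block([[A_t, -B_t.T],
                  [B_t, S_t]])
red = np.linalg.solve(K_red, np.concatenate([f, np.zeros(n_c)]))

same_solution = np.allclose(full[:n_v + n_c], red)
```

With these stand-in blocks, the reduced system reproduces the velocity and coarse pressure of the full system, and $D$ is symmetric positive semidefinite, which matches its interpretation as an additional stabilizing term on the velocity block.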

We provide the following result as an interpretation of our findings. 

Remark 1 . The additional diagonal block D in the system matrix is suspected 
to act as an additional stabilization term controlling the discrete divergence. 




A Two-Level Stabilization Scheme for the Navier-Stokes Equations 127 



We denote the I 2 scalar product by (•,•). On quasi-regular meshes where all 
cells are parallelograms, we have a mesh-size independent constant c > 0 so 
that . 

- <{Dvh,Vh) <C lldivt^fcll^. (16) 

Ken Ken 

Such a term is of common use in stabilized schemes, see [13]. 

Proof. By definition of D it holds: 

{Dvh,Vh) = {S^^B'vh,B'vh) ■ 

We denote by $I_f$ the index set of fine grid nodes $N_i$, and by $\varphi_i \in V_h$ the 
standard nodal hat functions of node $N_i$ with support $P_i$. It holds 

{B'vh,B'vh) = WB'vhf = 

ieif 

i^If K^"Th 

Since scales like we get {Dvh^Vh) ^ proof for 

the opposite direction 

c ^ ' 

KeTh 

will be given in [12]. 



4 Analysis for a transport equation 

We consider the following transport equation with a given constant transport 
vector $\beta \in \mathbb{R}^2$ and given continuous data $f$ and $g$: 

$u + (\beta\cdot\nabla)u = f$ in $\Omega$, $\quad u = g$ on $\Gamma_-$, (17) 

where $\Gamma_-$ is the inflow part of the boundary: 

$\Gamma_- := \{x \in \partial\Omega : n(x)\cdot\beta < 0\}$. 

Denoting the $L^2$-scalar product on the boundary $\partial\Omega$ by $\langle\cdot,\cdot\rangle$, we define the 
following bilinear form and linear functional: 

$a(u, \varphi) := (u, \varphi) + ((\beta\cdot\nabla)u, \varphi) - \langle(\beta\cdot n)_- u, \varphi\rangle$, 

$l(\varphi) := (f, \varphi) - \langle(\beta\cdot n)_- \tilde g, \varphi\rangle$. 

Here we have used the notation $x_- := \min(x, 0)$ and $\tilde g$ is a prolongation of $g$ to 
the whole boundary. Later on, we will use the notation $x_+ := \max(x, 0)$. Then, 
the continuous solution $u$ of equation (17) satisfies the variational equation: 

$a(u, \varphi) = l(\varphi) \quad \forall \varphi \in V$, (18) 

where, for instance, $V = H^1(\Omega)$. The discretization of (17) to be considered 
here is based on the stabilized version of (18) using the space of continuous 
piecewise bilinear finite elements $V_h$ as before: 

$a(u_h, \varphi) + s(u_h, \varphi) = l(\varphi) \quad \forall \varphi \in V_h$. (19) 

The stabilization term is defined similarly to one part used for the Navier-Stokes 
equations (8): 

$s(u_h, \varphi) := (\pi_h[(\beta\cdot\nabla)u_h], \delta\,\pi_h[(\beta\cdot\nabla)\varphi])$. 

From the following stability result we obtain existence and uniqueness of the 
discrete solution: 

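Lemma 1. We have 

$a(u_h, u_h) + s(u_h, u_h) = |||u_h|||^2$, (20) 

with 

$|||u_h||| := \left( \|u_h\|^2 + \tfrac{1}{2}\|\sqrt{|\beta\cdot n|}\,u_h\|_{\partial\Omega}^2 + \|\sqrt{\delta}\,\pi_h[(\beta\cdot\nabla)u_h]\|^2 \right)^{1/2}$. (21) 

Proof. Follows from integration by parts. 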
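
The integration by parts behind Lemma 1 can be spelled out as follows. This is a sketch under the assumptions stated in this section (constant $\beta$, boundary scalar product $\langle\cdot,\cdot\rangle$, and $x_\pm$ as defined above); the boundary term with weight $\tfrac12|\beta\cdot n|$ is our reading of the norm (21).

```latex
\begin{align*}
((\beta\cdot\nabla)u_h, u_h)
  &= \tfrac12 \int_\Omega \beta\cdot\nabla (u_h^2)\,dx
   = \tfrac12 \langle (\beta\cdot n)\,u_h, u_h\rangle
   && \text{(constant } \beta\text{)} \\
a(u_h,u_h)
  &= \|u_h\|^2 + \tfrac12 \langle (\beta\cdot n)\,u_h, u_h\rangle
     - \langle (\beta\cdot n)_-\,u_h, u_h\rangle \\
  &= \|u_h\|^2 + \tfrac12 \langle \big((\beta\cdot n)_+ - (\beta\cdot n)_-\big)u_h, u_h\rangle
   && \text{since } \beta\cdot n = (\beta\cdot n)_+ + (\beta\cdot n)_- \\
  &= \|u_h\|^2 + \tfrac12 \big\|\sqrt{|\beta\cdot n|}\,u_h\big\|_{\partial\Omega}^2 , \\
s(u_h,u_h)
  &= \big\|\sqrt{\delta}\,\pi_h[(\beta\cdot\nabla)u_h]\big\|^2 .
\end{align*}
```

Adding the last two lines gives (20). Since $|||u_h||| \ge \|u_h\|$, the discrete problem (19) with zero data has only the trivial solution, which yields uniqueness and, in finite dimensions, existence.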

In the following we give an error estimate. The proof is very similar to the 
classical one in [11] for the same equation supplied with SUPG stabilization. 
However, notice that in contrast to the proof for SUPG, we only have control 
over the streamline derivative of the fluctuations $\pi_h[(\beta\cdot\nabla)u_h]$ in (21). 

Theorem 2. Let $u$ be the continuous solution to (17) satisfying $u \in H^2(\Omega)$ 
and $u_h$ the discrete solution of (19). Then we have the following estimate: 

$\|u - u_h\| \le C h^{3/2} \|u\|_{H^2(\Omega)}$. (22) 

The estimate is similar to the standard estimate for SUPG or the discontinuous 
Galerkin method, [11]. With respect to the interpolation error we lose a power 
of 1/2. 

Proof. By $j_h : V \to V_h$ denote the modified Scott-Zhang interpolation 
operator introduced in [2] which has the following orthogonality property: 

$(u - j_h u, \varphi) = 0 \quad \forall \varphi \in W_{2h}$, (23) 

and allows for optimal interpolation in $L^2(\Omega)$ and $H^1(\Omega)$. That is, there exists 
a constant $C$ such that 

$\|u - j_h u\| \le C h^2 \|u\|_{H^2(\Omega)}$, 

$\|\nabla(u - j_h u)\| \le C h \|u\|_{H^2(\Omega)}$. 

We split the error as $u - u_h = \eta - \xi$, $\xi := u_h - j_h u$, $\eta := u - j_h u$. Due to this 
interpolation result, it is sufficient to show 

$|||\xi||| \le C h^{3/2} \|u\|_{H^2(\Omega)}$ (24) 

with (another) constant $C$ for proving the assertion. We have: 

$|||\xi|||^2 = a(u_h, \xi) + s(u_h, \xi) - a(j_h u, \xi) - s(j_h u, \xi)$ 

$= l(\xi) - a(j_h u, \xi) - s(j_h u, \xi)$ 

$= a(u, \xi) - a(j_h u, \xi) - s(j_h u, \xi)$ 

$= a(\eta, \xi) - s(j_h u, \xi)$ 

$= (\eta, \xi) + ((\beta\cdot\nabla)\eta, \xi) - \langle(\beta\cdot n)_- \eta, \xi\rangle - s(j_h u, \xi)$. 

The only critical terms are $((\beta\cdot\nabla)\eta, \xi)$ and $s(j_h u, \xi)$. We use partial integration 
to obtain 

$((\beta\cdot\nabla)\eta, \xi) = -(\eta, (\beta\cdot\nabla)\xi) + \langle(\beta\cdot n)\eta, \xi\rangle$ 

$= -(\eta, \pi_h[(\beta\cdot\nabla)\xi]) + \langle(\beta\cdot n)\eta, \xi\rangle$. 

In the last line, we have used the orthogonality property (23) of the 
interpolation operator $j_h$. Furthermore, the stabilization term can be bounded as 
follows: 

|s(ihW,0l < Y1 ■V)jhU\\K\/^\\Trh{f3 ■V)^\\k 

K€T2h 

\KeT2h ) 

Here, the last line is obtained by stability of the 1? projection Pw and the 
interpolation property of jh’. 

\\T^hW ■V)jhU\\K < Y2 i\\^hil3 ■y)u\\K + \\T^h{/3 ■V){jhU-u)\\K) 

K&T2H 

<h\\u\\H^O)+ E m-'^)UkU-u)\\K + \\Pw{(i-'^){jhU-u)\\K) 
KeT2h 

< Ch\\u\\H2(^Q) . 

Now we get: 

lll^lll^ = (»7,0 - iv,'!^h[{l3-V)^]) + {{l3-n)+ri,^)-s{jhu,0 

< ll^li ll ^ll + llV ^^II l|V5^-V)^]|| 

+\WiP-n)+ 7?||a^2||\/(/3•n)+ 

< (Ihll + llv^^ll + WVW^vWan + Ch^/^uy^n)) IH^III , 
and hence by the trace theorem: 

infill < llr/ll + 

< Ch~^/^\\rj\\ + Ch^^‘^\\u\\H2(n) 
which shows (24) since ||^|| < |||^|||. 

Summary. In this paper we present an a posteriori error estimator for parameter 
identification problems governed by partial differential equations. This estimator 
aims to control the error in parameters due to the discretization by finite elements. 
It is used in an adaptive mesh refinement algorithm generating a sequence of locally 
refined meshes for efficient computation of the parameters. Comparison with some 
heuristic mesh refinement algorithms is done for a simple example inverse problem. 



1 Introduction 

We consider parameter identification problems involving a finite number of 
unknown parameters in the following form: The state variable $u$ in an 
appropriate Hilbert space $V$ is determined by a partial differential equation (state 
equation) in weak form: 

$a(u, q)(\varphi) = f(\varphi) \quad \forall \varphi \in V$, (1) 

where $q \in Q = \mathbb{R}^{n_p}$ denotes the unknown parameters. The form $a$ is defined 
on the Hilbert space $V \times Q \times V$ and the linear functional $f \in V'$ represents 
the right hand side of the state equation, where $V'$ denotes the dual space of 
$V$. Further, we are given an observation operator $C : V \to Z$, which maps 
the state variable $u$ to the space of measurements $Z = \mathbb{R}^{n_m}$, where we assume 
$n_m \ge n_p$. We denote by $(\cdot,\cdot)_Z$ the scalar product of $Z$ and by $\|\cdot\|_Z$ the 
corresponding norm. Similar notations are used for the scalar product and norm 
in the space $Q$. 

The values of the parameters are estimated from a given set of 
measurements $\bar C \in Z$ using a least squares approach such that we obtain a constrained 
optimization problem with the cost functional $J : V \to \mathbb{R}$: 

Minimize $J(u) := \tfrac{1}{2}\|C(u) - \bar C\|_Z^2$ (2) 

under the constraint (1). Here, the cost functional is the squared norm of the 
residual defined by 

$R^{ls}(u) := \bar C - C(u)$. (3) 

The state equation is discretized by conforming finite elements on a regular 
mesh $T_h$, resulting in a finite element space $V_h \subset V$, see e.g. Ciarlet [8] for the 
standard construction. In order to ease mesh refinement, the cells are allowed 
to have nodes which lie on midpoints of faces of neighboring cells. But at most 
one such hanging node is permitted for each face, see Carey & Oden [7] for 
implementation details. 

The discrete state $u_h \in V_h$ and parameter $q_h \in Q$ are determined by: 

Minimize $J(u_h)$ (4) 

under the constraint 

$a(u_h, q_h)(\varphi_h) = f(\varphi_h) \quad \forall \varphi_h \in V_h$. (5) 

Due to the finite dimension of $Q$, we suppose the parameter $q_h$ in (4) to be 
sought in the space $Q$. 

The paper is organized as follows: In the next section we describe an 
optimization algorithm for solving the problem (4, 5) on a fixed mesh $T_h$. Section 3 
is devoted to a posteriori error estimation. Here, we present an a posteriori 
error estimator for the error in parameter $E(q) - E(q_h)$ for a given error 
functional $E : Q \to \mathbb{R}$. This error estimator is developed in [5]. It is based on 
the optimal control approach to a posteriori error estimation from Becker & 
Rannacher [4]. However, a direct application of the techniques described in [4] 
leads to an estimator which controls the error in the cost functional $J$ (2). In 
general, such an estimator does not provide useful error bounds for the 
parameters, in contrast to the approach presented here. In Section 4 we discuss a 
numerical example illustrating the usage of the error estimator. The presented 
approach is compared with some heuristic methods with respect to the quality 
of generated meshes. Conclusions are given in the last section. 



2 Optimization algorithm 

In this section we discuss an optimization algorithm for solving the 
problem (4, 5) on a fixed mesh $T_h$. 

Under the assumption of regularity of the partial derivative $a'_u$, the implicit 
function theorem in Banach spaces implies the existence of an open set $Q_0 \subset Q$, 
containing the optimal parameter $q$, and a continuously differentiable solution 
operator $S : Q_0 \to V$, $q \mapsto S(q)$, so that (1) is fulfilled for $u = S(q)$. This 
allows us to reformulate the problem (1, 2) as an unconstrained optimization 
problem: 

Minimize $j(q) := \tfrac{1}{2}\|c(q) - \bar C\|_Z^2, \quad q \in Q$, (6) 

where the reduced observation operator $c$ is given by $c(q) = C(S(q))$. Denoting 
by $G = c'(q)$ the Jacobian matrix of the reduced observation operator $c$, the 
first-order necessary condition $j'(q) = 0$ for (6) reads: 

$G^*(c(q) - \bar C) = 0$, (7) 

where $G^*$ denotes the transpose of $G$. The Jacobian matrix $G$ can be obtained 
using tangent solutions $w_j \in V$ determined by: 

$a'_u(u, q)(w_j, \varphi) = -a'_{q_j}(u, q)(1, \varphi) \quad \forall \varphi \in V, \quad j = 1 \ldots n_p$. (8) 

Then, one simply proves that the entries of the matrix $G$ are given by $G_{ij} = C'_i(u)(w_j)$. 

Similarly to the continuous case, we introduce a discrete solution operator 
$S_h : Q \to V_h$ for the discretized state equation (5) and obtain an unconstrained 
formulation of the discretized problem (4, 5) by: 

Minimize $j_h(q_h) := \tfrac{1}{2}\|c_h(q_h) - \bar C\|_Z^2, \quad q_h \in Q$, (9) 

where $c_h$ is the discrete reduced observation operator defined by $c_h(q_h) = 
C(S_h(q_h))$. The corresponding Jacobian matrix $G_h$ can be computed similarly 
to the continuous case using the discrete tangent solutions $w_{j,h} \in V_h$ determined 
by the discrete version of (8). 

The problem (9) is solved iteratively starting with an initial guess $q_h^0$ and 
using the recursive setting $q_h^{k+1} = q_h^k + \delta q_h$. The update $\delta q_h$ is obtained using 
a symmetric approximation $H_k$ of the Hessian of $j_h$ as the solution of the 
system of linear equations: 

$H_k\,\delta q_h = G_h^*(\bar C - c_h(q_h^k))$, (10) 

where $G_h = c'_h(q_h^k)$. The most widely used choice of the matrix, $H_k = G_h^* G_h$, 
leads to the Gauss-Newton algorithm, see e.g. Nocedal & Wright [10]. 

For one step of the Gauss-Newton algorithm the state equation and $n_p$ 
tangent problems (8) have to be solved, which originate from the same linear 
operator but with different right-hand sides. Due to the small dimension $n_p$ of 
the parameter space $Q$ the solution of (10) is uncritical. 
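
The Gauss-Newton iteration (10) with tangent solves (8) can be sketched on a small model problem. The example below is an assumption-laden illustration, not the setting of this paper: it uses a one-dimensional finite difference analogue $-q u'' + s u = 2$ on $(0,1)$ with homogeneous Dirichlet conditions instead of finite elements, a single scalar parameter, point observations at hypothetical grid indices, and synthetic measurements generated with $q = 1$. All function names are hypothetical.

```python
import numpy as np

def solve_state(q, s=200.0, n=99):
    """Finite difference state solve for -q u'' + s u = 2, u(0) = u(1) = 0 (assumed model)."""
    h = 1.0 / (n + 1)
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2  # discrete -d^2/dx^2
    A = q * K + s * np.eye(n)
    u = np.linalg.solve(A, 2.0 * np.ones(n))
    return u, A, K

def gauss_newton(c_bar, obs_idx, q0, tol=1e-10, max_iter=50):
    """Gauss-Newton for one scalar parameter with H_k = G^T G."""
    q = float(q0)
    for _ in range(max_iter):
        u, A, K = solve_state(q)
        # Tangent problem: same operator A, right-hand side -(dA/dq) u = -K u
        w = np.linalg.solve(A, -K @ u)
        G = w[obs_idx].reshape(-1, 1)        # Jacobian of the reduced observation operator
        residual = c_bar - u[obs_idx]        # measurements minus computed observations
        dq = float(np.linalg.solve(G.T @ G, G.T @ residual)[0])
        q += dq
        if abs(dq) < tol:
            break
    return q

obs_idx = np.array([4, 24, 49, 74, 94])      # hypothetical measurement nodes
c_bar = solve_state(1.0)[0][obs_idx]         # synthetic data for the exact parameter q = 1
q_est = gauss_newton(c_bar, obs_idx, q0=0.5)
```

Each iteration costs one state solve and one tangent solve with the same matrix, which mirrors the remark above that the tangent problems share the linear operator and differ only in their right-hand sides.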



3 A posteriori error estimation 

In this section we present an error estimator for the error with respect to a 
given error functional $E : Q \to \mathbb{R}$. The precise error representation is given in 
the following theorem. Here, we use an interpolation operator $i_h : V \to V_h$, 
see e.g. Clement [9]. 

Theorem 1. Let $(u, q)$ be a solution of the parameter identification 
problem (1, 2) and $(u_h, q_h)$ the corresponding discrete solution of the problem (4, 5). 
Then, for a given error functional $E$, we have that: 

$E(q) - E(q_h) = \tfrac{1}{2}\rho(u_h)(y - i_h y) + \tfrac{1}{2}\rho^*(u_h, y_h)(u - i_h u) + P + R$, (11) 

where $y \in V$ is the solution of the adjoint problem: 

$a'_u(u, q)(\varphi, y) = -(G(G^*G)^{-1}\nabla E(q), C'(u)(\varphi))_Z \quad \forall \varphi \in V$ (12) 

and $\rho(\cdot)(\cdot)$ and $\rho^*(\cdot)(\cdot)$ are the residuals of the state and adjoint equation 
defined by: 

$\rho(u_h)(\varphi) := f(\varphi) - a(u_h, q_h)(\varphi)$, 

$\rho^*(u_h, y_h)(\varphi) := -(G_h(G_h^*G_h)^{-1}\nabla E(q_h), C'(u_h)(\varphi))_Z - a'_u(u_h, q_h)(\varphi, y_h)$. (13) 

The remainder term $R$ (due to linearization) is quadratic in the error and the 
additional remainder term $P$ admits the estimate: 

$|P| \le C \left( \|e_u\|_V + \|e_q\|_Q + \|\delta_h v\|_V + \|\delta_h z\|_V \right) \|R^{ls}(u)\|_Z$, (14) 

where $e_u := u - u_h$, $e_q := q - q_h$ and $\delta_h\varphi := \varphi - i_h\varphi$ is an interpolation error 
operator. The mean tangent solution $v \in V$ is given by 

$v = -\sum_{j=1}^{n_p} \left((G^*G)^{-1}\nabla E(q)\right)_j w_j$, (15) 

and the normalized adjoint solution $z$ is determined by: 

$a'_u(u, q)(\varphi, z) = -\left(\frac{R^{ls}(u)}{\|R^{ls}(u)\|_Z}, C'(u)(\varphi)\right)_Z \quad \forall \varphi \in V$, (16) 

if the least squares residual $R^{ls}(u)$ does not vanish; otherwise we set $z = 0$. 
The constant $C$ does not depend on the mesh parameter $h$ nor on the 
measurements $\bar C$. 

Proof. For the proof we refer to [5]. 

For evaluation of the error estimator, denoted by $\eta_h$, the local interpolation 
errors $y - i_h y$ and $u - i_h u$ have to be approximated. In our numerical examples, 
we use interpolation of the computed bilinear finite element solutions $y_h$ and 
$u_h$ on the space of biquadratic finite elements on patches of cells. The main 
computational cost for the a posteriori error estimator described above is the 
solution of one auxiliary equation (12). This is cheap, even in comparison with 
only one Gauss-Newton step, which includes solution of the state (nonlinear) 
and of the several (linear) tangent equations. 

In order to illustrate the typical use of the error estimator $\eta_h$, we sketch 
a generic adaptive mesh refinement algorithm. Such an algorithm generates 
a sequence of locally refined meshes and corresponding finite element spaces 
until the estimated error with respect to $E$ is below a given tolerance TOL. 

Adaptive Mesh Refinement Algorithm 

1. Choose an initial mesh $T_{h_0}$ and set $k = 0$ 

2. Construct the finite element space $V_{h_k}$ 

3. Compute $u_{h_k} \in V_{h_k}$, $q_{h_k} \in Q$ solving (4, 5) 

4. Evaluate the a posteriori error estimator $\eta_{h_k}$ 

5. If $\eta_{h_k} \le$ TOL quit 

6. Refine $T_{h_k} \to T_{h_{k+1}}$ using information from $\eta_{h_k}$ 

7. Increment $k$ and go to 2. 



Remark 1. In step 3, the parameter identification problem is solved on a fixed 
mesh. As initial data, we use the values from the computation on the previous 
mesh. This allows us to avoid unnecessary iterations of the optimization loop 
on fine meshes. 
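
The adaptive loop, including the warm start described in Remark 1, can be sketched generically. The driver below takes the solver, estimator and refinement routine as callables; these interfaces, and the one-dimensional toy problem used to exercise them (cell-wise indicators equal to the squared cell size, refinement of all cells whose indicator is at least half the maximum), are hypothetical stand-ins and not the estimator of Theorem 1.

```python
def adaptive_refinement(mesh, solve, estimate, refine, tol, max_steps=20):
    """Generic solve / estimate / refine loop (steps 1-7 of the algorithm)."""
    solution = None
    history = []
    for _ in range(max_steps):
        # Steps 2-3: solve on the current mesh, warm-started from the previous solution
        solution = solve(mesh, solution)
        # Step 4: global estimate and cell-wise indicators
        eta, indicators = estimate(mesh, solution)
        history.append(eta)
        # Step 5: stopping criterion
        if eta <= tol:
            break
        # Step 6: local refinement guided by the indicators
        mesh = refine(mesh, indicators)
    return mesh, solution, history

# --- toy stand-ins on a 1D mesh given as a sorted list of nodes (assumptions) ---
def toy_solve(mesh, previous):
    return None  # a real solver would return (u_h, q_h), started from `previous`

def toy_estimate(mesh, solution):
    indicators = [(b - a) ** 2 for a, b in zip(mesh[:-1], mesh[1:])]
    return sum(indicators), indicators

def toy_refine(mesh, indicators, fraction=0.5):
    threshold = fraction * max(indicators)
    new_mesh = [mesh[0]]
    for (a, b), ind in zip(zip(mesh[:-1], mesh[1:]), indicators):
        if ind >= threshold:
            new_mesh.append(0.5 * (a + b))   # bisect marked cells
        new_mesh.append(b)
    return new_mesh

mesh, _, history = adaptive_refinement(
    [0.0, 0.5, 1.0], toy_solve, toy_estimate, toy_refine, tol=0.05)
```

Passing the previous solution into `solve` is one simple way to realize the warm start of Remark 1; how the parameter and state are transferred between meshes is left to the caller.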



4 Numerical result 

In this section we compare our general approach to mesh refinement for param- 
eter identification problems with some heuristic methods. We consider three 
types of heuristic approaches for mesh refinement: a strategy based only on 
the information obtained from the computed state variable, a strategy based 
only on the a priori knowledge of the structure of the observation operator and 
a strategy, which combines both types of information. 

We consider the following diffusion-reaction equation with unknown 
coefficient $q$ in the unit square $\Omega = (0, 1)^2$: 

$-q\Delta u + su = 2$ in $\Omega$, 
$u = 0$ on $\partial\Omega$, (17) 

where $s$ is chosen as $s = 200$. The parameter $q$ is estimated using measurements 
given by the values of the state variable at nine different points $\xi_i$, see Figure 1. 
The exact value of the parameter is $q = 1$. 

The components of the corresponding observation operator $C$ have the 
following form: 

$C_i(v) = v(\xi_i)$, (18) 

and the parameter identification problem is formulated as follows: For $(u, q) \in 
V \times Q$ with $V = H^1_0(\Omega)$ and $Q = \mathbb{R}$, 

Minimize $\tfrac{1}{2}\sum_{i=1}^{9} \left(u(\xi_i) - \bar C_i\right)^2$ (19) 

under the constraint (17), where $\bar C_i$ denote the components of the measurement 
vector $\bar C \in Z = \mathbb{R}^9$ and are given by the values of the state variable $u$ for the 
exact parameter $q$, i.e. $\bar C_i = u(\xi_i)$. 

Fig. 1. The computational domain with measurement points marked by circles 

First, we compare the quality of meshes generated by our a posteriori error 
estimator for this problem with a typical strategy based on a posteriori 
information obtained from the state variable, i.e. with the mesh refinement guided by 
one of the well-known “energy” type error estimators for the uncontrolled 
equation, see e.g. Bank & Weiser [2] and Babuska & Miller [1]. These estimators aim 
to control the error $u - u_h$ in the $H^1$-norm, but they do not take into account the 
structure of the parameter identification problem. As seen from Figure 2, adaptive 
refinement based on the “energy” estimator leads to a similar reduction of the 
error as global refinement. However, the strategy based on our error estimator 
leads to an obvious saving in the number of unknowns necessary to achieve a 
prescribed accuracy level. 




Fig. 2. Errors in q for different refinement strategies vs. number of nodes (global 
refinement, “energy” -based refinement and refinement resulting from our a posteriori 
error estimator) 

Next, we compare our strategy for mesh refinement with the following 
heuristic approach: In each iteration of the mesh refinement we refine the 
cells which lie close to one of the measurement points, i.e. in the set 

$\bigcup_i \{x \mid \|x - \xi_i\| < r\}$, (20) 

where $r \in \mathbb{R}_+$ is a given number. In contrast to our approach, this strategy 
is unable to weight the relative importance of the measurement points. The 
corresponding comparison is done in Figure 3 for two choices of $r$ ($r = 0.04$ 
and $r = 0.1$). 
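
The heuristic marking rule (20) can be sketched as follows. The sketch assumes that a cell is represented by its center and is marked when that center lies within distance $r$ of some measurement point; this proxy, the function name, and the sample coordinates are assumptions made for illustration only.

```python
import numpy as np

def mark_cells_near_points(cell_centers, points, r):
    """Boolean mask of cells whose center lies in the union of balls of radius r
    around the measurement points (center-based proxy for the set (20))."""
    centers = np.asarray(cell_centers, dtype=float)   # shape (n_cells, 2)
    pts = np.asarray(points, dtype=float)             # shape (n_points, 2)
    # Pairwise distances between every cell center and every measurement point
    dist = np.linalg.norm(centers[:, None, :] - pts[None, :, :], axis=2)
    return (dist < r).any(axis=1)

centers = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]]        # hypothetical cell centers
measurement_points = [[0.5, 0.52]]                    # hypothetical measurement point
marked = mark_cells_near_points(centers, measurement_points, r=0.04)
```

Every point contributes with the same radius, so the rule cannot express that some measurement points matter more for the parameter than others, which is the limitation noted above.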




Fig. 3. Errors in q for different refinement strategies vs. number of nodes (global 
refinement, refinement across measurement points and refinement resulting from our 
a posteriori error estimator) 



After several steps one does not observe any error reduction despite in- 
creasing the number of nodes for both choices of r in the described strategy, 
as could be expected. Typical meshes resulting from application of our a pos- 
teriori error estimator, “energy” estimator and the last strategy are shown in 
Figure 4. 

We also compare our mesh refinement procedure with a combination of the 
last heuristic methods. By this strategy both the cells marked by the “energy” 
estimator and the cells across the measurement points (20) are refined. The 
corresponding comparison with our mesh refinement procedure is made in 
Figure 5 for two choices of r (r = 0.04 and r = 0.1). 

The typical meshes resulting from application of this strategy for two 
choices of r (r = 0.04 and r = 0.1) are shown in Figure 6. 

Fig. 4. Typical meshes produced by our a posteriori error estimator (left), energy 
error estimator (right) and the refinement across the measurement points 

Fig. 5. Errors in q for different refinement strategies vs. number of nodes (global 
refinement, refinement produced by combining the refinement across measurement points 
and “energy”-based refinement, and refinement resulting from our a posteriori error 
estimator) 

Fig. 6. Typical meshes produced by combining the refinement according to “energy” 
error estimator and the refinement across the measurement points for r = 0.1 (left) 
and r = 0.04 (right) 



5 Conclusions 

We presented an a posteriori error estimator for finite element discretization 
of parameter identification problems. This error estimator is cheap to evaluate 
and assesses the error we are interested in, i.e. the error in the parameters. We 
compared our approach with some heuristic methods with respect to the quality 
of the generated meshes. The presented error estimator is successfully applied 
to parameter identification in CFD problems, see Becker & Vexler [6] and to 
estimation of chemical models in multidimensional reactive flows, see Becker, 
Vexler & Braack [3]. 

Summary. In this contribution, we present a phase-field model of advected 
mean-curvature flow and of advected pattern formation in solidification. The model is 
based on the approach presented in [3], where an extensive literature list on 
methods treating mean-curvature problems can be found. The model represents a step 
towards simulation of solidification processes where the melt motion is important. 
We give basic mathematical information concerning the weak solution of the model 
equations, introduce a numerical scheme based on the Finite-Difference Method, and 
show several numerical studies demonstrating basic qualitative effects of advection in 
the given context. 



Mean curvature flow with advection. The problem of mean curvature 
flow of hypersurfaces (see [7, 6, 5]) is usually set as follows: 

normal velocity = -mean curvature + forcing, 

in the normal direction to a closed hypersurface $\Gamma$. In this work, we consider 
the law modified by an imposed velocity field $V$, which means that the 
hypersurface is advected by the vector field as well. Using the notations $n_\Gamma$ for 
the Euclidean normal vector to $\Gamma$, $v_\Gamma$ for the normal velocity, $\kappa_\Gamma$ for the mean 
curvature, and $F$ for the forcing term, we formulate the motion law for $\Gamma$ as 
follows: 

$v_\Gamma - V\cdot n_\Gamma = -\kappa_\Gamma + F$. (1) 

Equation (1) originates from the modified Stefan problem with advection 
describing the solidification of crystalline materials where the bulk mixture of 
liquid and solid is carried by an imposed vector field $V$: 

$\frac{d_m u}{dt} = \Delta u$ in $\Omega_s$ and $\Omega_l$, 

$\frac{\partial u}{\partial n_\Gamma}\Big|_s - \frac{\partial u}{\partial n_\Gamma}\Big|_l = L(v_\Gamma - V\cdot n_\Gamma)$ on $\Gamma(t)$, (2) 

$F(u) = \kappa_\Gamma + \alpha(v_\Gamma - V\cdot n_\Gamma)$ on $\Gamma(t)$, 

where $u$ denotes the temperature field, $u^*$ the temperature of melting point, 
$\Omega_s$, $\Omega_l$ the solid and liquid subdomains of $\Omega$, $L$ the latent heat per unit volume, 
$\alpha$ a material parameter, $F(u)$ a coupling term ($= u - u^*$), and 
$\frac{d_m}{dt} = \frac{\partial}{\partial t} + V\cdot\nabla$ the material derivative. 

For details of the physical context, we refer the reader to [8, 10]. Obviously, 
the physical problem given above simplifies the problem of density driven flow 
around the solidifying structure in a real material. In this general case, the 
law of momentum conservation would be needed, and the boundary 
conditions should also correspond to the conservation of the quantities in question. Our 
purpose is to study the problem (2) by a diffuse-interface approach which 
allows us to design a suitable numerical algorithm and to perform qualitative 
numerical studies. 

Allen-Cahn equation with advection. The law (1) can be treated by 
a particular method of levelset type (see [3]), which relies on solution properties 
of a reaction-diffusion equation of Allen-Cahn type (see [9]). We refer the 
reader to [3] for a sample of literature resources on this topic. Introducing the 
thickness $\xi > 0$ of the diffuse layer surrounding $\Gamma$ and the polynomial $f_0(p) = 
a\,p(1 - p)(p - \tfrac{1}{2})$ as the minus derivative of a double-well function $w_0$, we present 
the phase-field approximation of the advected mean curvature flow 
as follows 



ae'^=eAp+Hp)+Fe\^p\. ( 3 ) 

In Figure 1, we illustrate the expected fact, that the ^-levelset of a solution to 
(3) converges to the set F{t) evolved by (1) - see [2] for the no-advection case. 
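
The link between the polynomial $f_0$ and a double-well function can be checked by hand. The following short derivation is only a sketch: it assumes $f_0(p) = a\,p(1-p)(p - \tfrac12)$ and picks the quartic $w_0(p) = \tfrac{a}{4}p^2(1-p)^2$, whose additive constant is an assumption, since the text fixes $w_0$ only through $w_0' = -f_0$.

\[
\begin{aligned}
w_0'(p) &= \tfrac{a}{4}\bigl(2p(1-p)^2 - 2p^2(1-p)\bigr) = \tfrac{a}{2}\,p(1-p)(1-2p) = -a\,p(1-p)\bigl(p - \tfrac12\bigr) = -f_0(p), \\
f_0'(p) &= a\bigl(-3p^2 + 3p - \tfrac12\bigr) \le f_0'\bigl(\tfrac12\bigr) = \tfrac{a}{4}.
\end{aligned}
\]

Under this assumption the two wells of $w_0$ sit at $p = 0$ and $p = 1$, and the bound on $f_0'$ gives the one-sided Lipschitz estimate $(f_0(p_1) - f_0(p_2))(p_1 - p_2) \le \tfrac{a}{4}(p_1 - p_2)^2$, which is the kind of estimate a uniqueness argument typically relies on.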




Fig. 1. Schematic relationship to the original motion law 



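Phase-field equations with advection. The approach indicated above can
be used to approximate (or regularize) the physical problem (2). In this case,
we obtain a complete system of phase-field equations with advection reading
as

\[
\begin{aligned}
\frac{\partial_m u}{\partial t} &= \Delta u + L\,\chi'(p)\,\frac{\partial_m p}{\partial t}, \\
\alpha\xi^2\,\frac{\partial_m p}{\partial t} &= \xi^2\Delta p + f_0(p) + F(u)\,\xi^2\,|\nabla p|,
\end{aligned}
\tag{4}
\]
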

On a Phase-Field Model with Advection 






with the initial conditions $u|_{t=0} = u_{ini}$, $p|_{t=0} = p_{ini}$, and with the homogeneous
Dirichlet boundary conditions (set for the sake of simplicity). Additionally,
we assume that $F(u)$ is a bounded Lipschitz-continuous function.

The enthalpy of the system $H(u) = u - L\chi(p)$ is expressed by means of
a focusing function $\chi$, which is monotone with bounded, Lipschitz-continuous
derivative:

\[ \chi(0) = 0, \quad \chi(0.5) = 0.5, \quad \chi(1) = 1, \quad \operatorname{supp}(\chi') \subset (0, 1). \]

In the following theorem, we set $\chi(p) = p$ for simplicity, although the computations
are performed for a nontrivial $\chi$. The general case is investigated in
[2]. Considering a bounded domain $\Omega \subset \mathbb{R}^n$ with boundary $\partial\Omega$, we can state
the following basic property of the system (4):

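Theorem 1. If $u_{ini}, p_{ini} \in H_0^1(\Omega)$, $\mathbf{V} \in L_\infty(\Omega; \mathbb{R}^n)$, and $\xi$ remains fixed, then
there is a unique solution $u, p \in L_2(0, T; H_0^1(\Omega))$ of the weak problem

\[
\begin{aligned}
&\forall v, q \in \mathcal{D}(\Omega), \text{ a.e. in } (0, T): \\
&\frac{d}{dt}(u - Lp, v) + (\mathbf{V}\cdot\nabla(u - Lp), v) + (\nabla u, \nabla v) = 0, \\
&\alpha\xi^2\Bigl(\frac{d}{dt}(p, q) + (\mathbf{V}\cdot\nabla p, q)\Bigr) + \xi^2(\nabla p, \nabla q) = (f_0(p), q) + \xi^2(F(u)|\nabla p|, q), \\
&u(0) = u_{ini}, \quad p(0) = p_{ini},
\end{aligned}
\tag{5}
\]

for which

\[ p \in L_2(0, T; H_0^1(\Omega) \cap H^2(\Omega)), \quad \frac{\partial p}{\partial t} \in L_2(0, T; L_2(\Omega)). \]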

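Proof. The proof is an extension of the result stated in [2]. We concentrate on
issues closely related to the advection terms in both equations. By means of a
total set in $L_2(\Omega)$ (e.g., consisting of eigenvectors of $-\Delta$) denoted as $\{v_i\}_{i \in \mathbb{N}}$,
we define a finite-dimensional subspace

\[ V_m = \operatorname{span}\{v_i\}_{i \in \mathbb{N}_m}, \quad \text{where } \mathbb{N}_m = \{1, \dots, m\}, \]

and consider the projector $P_m : L_2(\Omega) \to V_m$. By means of the Faedo-Galerkin
method, we derive a semi-discrete scheme: for all $v, q \in V_m$,

\[
\begin{aligned}
&(\partial_t(u^m - Lp^m), v) + (\mathbf{V}\cdot\nabla(u^m - Lp^m), v) + (\nabla u^m, \nabla v) = 0 \quad \text{a.e. in } (0, T), \\
&\alpha\xi^2\bigl((\partial_t p^m, q) + (\mathbf{V}\cdot\nabla p^m, q)\bigr) + \xi^2(\nabla p^m, \nabla q) = (f_0(p^m), q) + \xi^2(F(u^m)|\nabla p^m|, q), \\
&u^m(0) = P_m u_{ini}, \quad p^m(0) = P_m p_{ini}.
\end{aligned}
\tag{6}
\]

The approximate solution is expressed by means of the basis functions of $V_m$:

\[ u^m(t) = \sum_{i \in \mathbb{N}_m} \gamma_i^m(t)\, v_i, \quad p^m(t) = \sum_{i \in \mathbb{N}_m} \beta_i^m(t)\, v_i. \]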
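In fact, the semi-discrete scheme is a system of ODEs for the unknown functions
of time $\gamma_i^m$, $\beta_i^m$. The next step is the derivation of energy estimates. We multiply the
equations by $\dot\gamma_i^m$, $\dot\beta_i^m$, respectively, and sum over $i \in \mathbb{N}_m$:

\[
\begin{aligned}
\|\partial_t u^m\|^2 + \frac{1}{2}\frac{d}{dt}\|\nabla u^m\|^2 + (\mathbf{V}\cdot\nabla(u^m - Lp^m), \partial_t u^m) &= L(\partial_t p^m, \partial_t u^m), \\
\alpha\xi^2\|\partial_t p^m\|^2 + \frac{\xi^2}{2}\frac{d}{dt}\|\nabla p^m\|^2 + \alpha\xi^2(\mathbf{V}\cdot\nabla p^m, \partial_t p^m) &= (f_0(p^m), \partial_t p^m) + \xi^2(F(u^m)|\nabla p^m|, \partial_t p^m).
\end{aligned}
\]

Consequently, by the Schwarz and Young inequalities, we obtain the inequalities

\[
\begin{aligned}
\|\partial_t u^m\|^2 + \frac{d}{dt}\|\nabla u^m\|^2 &\le 3L^2\|\partial_t p^m\|^2 + 3V^2\|\nabla u^m\|^2 + 3V^2L^2\|\nabla p^m\|^2, \\
\frac{1}{2}\alpha\xi^2\|\partial_t p^m\|^2 + \frac{\xi^2}{2}\frac{d}{dt}\|\nabla p^m\|^2 + \frac{d}{dt}(w_0(p^m), 1) &\le \xi^2\Bigl(\alpha V^2 + \frac{C_F^2}{\alpha}\Bigr)\|\nabla p^m\|^2,
\end{aligned}
\]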
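where $w_0' = -f_0$, $|F(u)| \le C_F$, $V = \|\mathbf{V}\|_\infty$. They allow us to use the compactness
method in the same manner as presented in [2]. Namely, the
assumptions of the theorem together with the above estimates processed by the
Gronwall lemma give that, independently of $m$, $\nabla u^m$, $\nabla p^m$ are bounded in
$L_\infty(0, T; L_2(\Omega))$, and $p^m$ are bounded in $L_\infty(0, T; L_s(\Omega))$ for each finite time
$T > 0$ and for any $s > 1$. Repeated integration shows that $\partial_t u^m$, $\partial_t p^m$ are
bounded in $L_2(0, T; L_2(\Omega))$ for each finite time $T > 0$, independently of $m$.

Therefore, we are able to pass to the weak limits $u^{m'} \rightharpoonup u$ in $L_2(0, T; H_0^1(\Omega))$,
$p^{m'} \rightharpoonup p$ in $L_2(0, T; H_0^1(\Omega) \cap L_4(\Omega))$ via a subsequence $m'$, and additionally,
thanks to the compact-imbedding theorem with the assumptions

\[
\begin{aligned}
&\{u^m\}_{m=1}^\infty \text{ bounded in } L_2(0, T; H_0^1(\Omega)), \quad \Bigl\{\frac{\partial u^m}{\partial t}\Bigr\}_{m=1}^\infty \text{ bounded in } L_2(0, T; L_2(\Omega)), \\
&\{p^m\}_{m=1}^\infty \text{ bounded in } L_4(0, T; H_0^1(\Omega) \cap L_4(\Omega)), \quad \Bigl\{\frac{\partial p^m}{\partial t}\Bigr\}_{m=1}^\infty \text{ bounded in } L_2(0, T; L_2(\Omega)),
\end{aligned}
\]

to the strong limits $u^{m'} \to u$ in $L_2(0, T; L_2(\Omega))$, $p^{m'} \to p$ in $L_4(0, T; L_4(\Omega))$.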

The passage to the limit in the semi-discrete scheme (6) can be accomplished
due to the following facts:

1. $\nabla p^{m'}$ converges strongly in $L_2(0, T; L_2(\Omega; \mathbb{R}^n))$ to $\nabla p$ (Lemma 3.4 of [2]),
   $\nabla u^{m'}$ converges weakly in $L_2(0, T; L_2(\Omega; \mathbb{R}^n))$ to $\nabla u$,

2. $\mathbf{V}\cdot\nabla(u^{m'} - Lp^{m'})$ converges weakly in $L_2(0, T; L_2(\Omega))$ to $\mathbf{V}\cdot\nabla(u - Lp)$,
   $\mathbf{V}\cdot\nabla p^{m'}$ converges weakly in $L_2(0, T; L_2(\Omega))$ to $\mathbf{V}\cdot\nabla p$,

3. $f_0(p^{m'})$ converges weakly in $L_{4/3}(0, T; L_{4/3}(\Omega))$ (polynomial nonlinearity and
   Aubin lemma),

4. $F(u^{m'})|\nabla p^{m'}|$ converges weakly to $F(u)|\nabla p|$ in $L_2(0, T; L_2(\Omega))$ (Lemma 3.5
   of [2]),

5. $P_{m'} p_{ini}$, $P_{m'} u_{ini}$ converge strongly to $p_{ini}$, $u_{ini}$ in $L_2(\Omega)$,

6. $p^{m'}(0) = P_{m'} p_{ini}$, $u^{m'}(0) = P_{m'} u_{ini}$.

Additionally, we observe that the function $p$ belongs to $L_2(0, T; H_0^1(\Omega) \cap H^2(\Omega))$
(Lemma 3.3 of [2]). The weak solution satisfies the initial condition (again see
[2] for details).

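Due to the Lipschitz continuity of $F$, with the Lipschitz constant denoted by
$L_F$, we prove uniqueness of the solution of (5). We consider two solutions
of problem (5), denoted by $[u_1, p_1]$ and $[u_2, p_2]$. Subtracting the corresponding
systems of equations, denoting $[u_{12}, p_{12}] = [u_1 - u_2, p_1 - p_2]$, and multiplying
the first equation by $u_{12} - Lp_{12}$ and the second equation by $p_{12}$, we have

\[
\begin{aligned}
&\frac{1}{2}\frac{d}{dt}\|u_{12} - Lp_{12}\|^2 + \|\nabla(u_{12} - Lp_{12})\|^2 + L(\nabla p_{12}, \nabla(u_{12} - Lp_{12})) \\
&\qquad + (\mathbf{V}\cdot\nabla(u_{12} - Lp_{12}), u_{12} - Lp_{12}) = 0 \quad \text{in } (0, T), \\
&\frac{1}{2}\alpha\xi^2\frac{d}{dt}\|p_{12}\|^2 + \xi^2\|\nabla p_{12}\|^2 + \alpha\xi^2(\mathbf{V}\cdot\nabla p_{12}, p_{12}) = (f_0(p_1) - f_0(p_2), p_{12}) \\
&\qquad + \xi^2(F(u_1)|\nabla p_1| - F(u_2)|\nabla p_2|, p_{12}) \quad \text{in } (0, T), \\
&(u_{12} - Lp_{12})(0) = 0, \quad p_{12}(0) = 0.
\end{aligned}
\]

Due to the form of $f_0$, we have that

\[ (f_0(p_1) - f_0(p_2), p_{12}) \le \frac{a}{4}\|p_{12}\|^2. \]

Using the Schwarz inequality, we get

\[
\begin{aligned}
\frac{d}{dt}\|u_{12} - Lp_{12}\|^2 &\le V^2\|u_{12} - Lp_{12}\|^2 + L^2\|\nabla p_{12}\|^2, \\
\frac{1}{2}\alpha\xi^2\frac{d}{dt}\|p_{12}\|^2 + \xi^2\|\nabla p_{12}\|^2 &\le \frac{a}{4}\|p_{12}\|^2 + \xi^2 L_F\|u_{12}\|\,\|\nabla p_1\|_{L_4(\Omega)}\|p_{12}\|_{L_4(\Omega)} \\
&\quad + (\xi^2 C_F + \alpha\xi^2 V)\|\nabla p_{12}\|\,\|p_{12}\|,
\end{aligned}
\tag{7}
\]
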

in (0,T). Due to the Young inequality, we obtain 

d|bi 2 - Lpi 2 f < L^\\ypi 2 f + ^"Ibi 2 - ipi 2 f in (0,T), 

Ic^e^^\\pi2f + j\\^pi2r< (8) 

liCie + y + + 2L2e2^|.C|||Vpi||2^(^))lbi2||^ 

+3^^L|^C||| Vpi||l4(/2) H'^12 ~ Lpi2\\'^. 



Combining these inequalities, we have in (0,T): 
M{t) (^^ae\\puf + ^\\un-Lpr2f^

with 

M{t) = y2+6L2i|c|||Vpil|2^(^)+^+^+3aV-2^^Ll.C|||Vpi||2^^ 

Considering the fact that there is a constant $C_p$ for which

\[ \int_0^T \|\nabla p_1\|^2_{L_4(\Omega)}\,dt \le C_4^2 \int_0^T \|p_1\|^2_{H^2(\Omega)}\,dt \le C_p, \]

where $C_4$ is the norm of the imbedding of $H_0^1(\Omega)$ into $L_4(\Omega)$, and
$p_1 \in L_2(0, T; H^2(\Omega))$, we have that

\[ p_{12}(t) = u_{12}(t) = 0 \quad \text{in } L_2(\Omega), \quad \forall t \in (0, T), \]

as follows from the Gronwall lemma. □

Asymptotic behaviour. A priori estimates in the above given proof imply 
that 



E^[p]{t) < E^\p](0)exp{^t} te(0,T), 

where we denoted 

+ iwo(pj)]da:, 

and additionally, there is an estimate for the time derivative given by 

\\dtpfdt + E^\p]{T) < CtE^\p]{0). 

We therefore can state that the function p = p(t^x;^) tends to a stepwise 
constant function as in the Theorem 2.2 of [4]. As in [2], the matching procedure 
recovers the Gibbs-Thompson law V - nr) = — /^r + F{u^) + as 

well as the Stefan condition at F. 

Remark. As indicated in [1], the model (4) can be modified by incorporating 
anisotropy into coefficients of (1). The above given analysis is applicable again, 
namely for weak anisotropies. 
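Method of lines in 2D. In the computations, we take $\Omega = (0, L_1) \times (0, L_2)$
and use a uniform rectangular grid. The following notations are introduced:

\[
\begin{aligned}
&h = (h_1, h_2), \quad h_1 = \frac{L_1}{N_1}, \quad h_2 = \frac{L_2}{N_2}, \quad u_{ij} = u(ih_1, jh_2), \\
&\omega_h = \{[ih_1, jh_2] \mid i = 1, \dots, N_1 - 1;\ j = 1, \dots, N_2 - 1\}, \\
&\bar\omega_h = \{[ih_1, jh_2] \mid i = 0, \dots, N_1;\ j = 0, \dots, N_2\}, \quad \gamma_h = \bar\omega_h - \omega_h,
\end{aligned}
\]

and

\[
\begin{aligned}
&u_{\bar x_1, ij} = \frac{u_{ij} - u_{i-1,j}}{h_1}, \quad u_{x_1, ij} = \frac{u_{i+1,j} - u_{ij}}{h_1}, \quad
u_{\bar x_2, ij} = \frac{u_{ij} - u_{i,j-1}}{h_2}, \quad u_{x_2, ij} = \frac{u_{i,j+1} - u_{ij}}{h_2}, \\
&u_{\bar x_1 x_1, ij} = \frac{u_{i+1,j} - 2u_{ij} + u_{i-1,j}}{h_1^2}, \quad
u_{\bar x_2 x_2, ij} = \frac{u_{i,j+1} - 2u_{ij} + u_{i,j-1}}{h_2^2}, \\
&\bar\nabla_h u = [u_{\bar x_1}, u_{\bar x_2}], \quad \nabla_h u = [u_{x_1}, u_{x_2}], \quad \Delta_h u = u_{\bar x_1 x_1} + u_{\bar x_2 x_2}.
\end{aligned}
\]

The set of grid functions is denoted by $\mathcal{H}_h$. The semi-discrete scheme has the
following form

\[
\begin{aligned}
&\Bigl(\frac{d}{dt} + \mathbf{V}\cdot\nabla_h\Bigr)\bigl(u^h - L\chi(p^h)\bigr) = \Delta_h u^h, \\
&\alpha\xi^2\Bigl(\frac{d}{dt} + \mathbf{V}\cdot\nabla_h\Bigr)p^h = \xi^2\Delta_h p^h + f_0(p^h) + \xi^2|\nabla_h p^h|\,F(u^h) \quad \text{on } \omega_h, \\
&u^h|_{\gamma_h} = 0, \quad p^h|_{\gamma_h} = 0, \\
&u^h(0) = P_h u_{ini}, \quad p^h(0) = P_h p_{ini},
\end{aligned}
\]

where its solution is a map $u^h, p^h : \langle 0, T\rangle \to \mathcal{H}_h$ and $P_h : C(\bar\Omega) \to \mathcal{H}_h$ is
a restriction operator. The stability and convergence of the scheme can be
investigated in a way very similar to the proof of Theorem 1. The scheme
is designed to meet real conditions, where the diffusion and growth process
dominates the advection.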
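
To make the structure of such a semi-discrete scheme concrete, the following NumPy sketch evaluates the right-hand side of the phase equation on a uniform grid and advances it by one explicit Euler step. It is an illustration under stated assumptions, not the scheme used for the computations reported here: the forcing `F` is taken as a given constant instead of being coupled to a temperature field, forward differences are assumed for the advection and gradient terms, explicit Euler is a stand-in for whatever ODE solver is actually applied, and the default `a = 2.0` as well as all function names are hypothetical.

```python
import numpy as np


def f0(p, a=2.0):
    """Reaction term f0(p) = a p (1 - p)(p - 1/2); the value of a is an assumption."""
    return a * p * (1.0 - p) * (p - 0.5)


def laplace_h(u, h1, h2):
    """Five-point Laplacian on interior nodes; boundary entries are left at zero."""
    lap = np.zeros_like(u)
    lap[1:-1, 1:-1] = ((u[2:, 1:-1] - 2.0 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / h1**2
                       + (u[1:-1, 2:] - 2.0 * u[1:-1, 1:-1] + u[1:-1, :-2]) / h2**2)
    return lap


def grad_h(u, h1, h2):
    """Forward differences [u_x1, u_x2] on interior nodes."""
    g1 = np.zeros_like(u)
    g2 = np.zeros_like(u)
    g1[1:-1, 1:-1] = (u[2:, 1:-1] - u[1:-1, 1:-1]) / h1
    g2[1:-1, 1:-1] = (u[1:-1, 2:] - u[1:-1, 1:-1]) / h2
    return g1, g2


def rhs_allen_cahn(p, V, F, xi, alpha, h1, h2, a=2.0):
    """dp/dt from alpha xi^2 (dp/dt + V.grad_h p) = xi^2 Lap_h p + f0(p) + xi^2 |grad_h p| F."""
    g1, g2 = grad_h(p, h1, h2)
    advection = V[0] * g1 + V[1] * g2
    source = xi**2 * laplace_h(p, h1, h2) + f0(p, a) + xi**2 * np.hypot(g1, g2) * F
    dpdt = source / (alpha * xi**2) - advection
    # Homogeneous Dirichlet data: boundary values do not change in time.
    dpdt[0, :] = dpdt[-1, :] = 0.0
    dpdt[:, 0] = dpdt[:, -1] = 0.0
    return dpdt


def euler_step(p, dt, **params):
    """One explicit Euler step (a simple stand-in for a general ODE solver)."""
    return p + dt * rhs_allen_cahn(p, **params)
```

Explicit time stepping of this kind is only stable for time steps of the order of $\alpha h^2$, which is one reason a production code may prefer a higher-order or adaptive ODE solver.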

Computational results. We present several results of advected curve dynamics
and advected pattern formation in solidification. Figure 2 shows how
the advection field influences particular situations of curve dynamics. In Figure
2a, the circle of critical radius is carried down, and when it interacts with the
domain boundary, it starts to shrink. Obviously, such a circle remains
unchanged when no advection is imposed. In Figure 2b, the initial circle is
converted to an anisotropic pattern due to the anisotropy incorporated into
the model. When it expands, it interacts with the boundary. In Figure 2c, a
circle shrinks while being carried around by advection. In Figure 2d, a circle
at critical radius is carried around by advection. As there is no interaction
with the boundary, it remains unchanged.




148 M. Benes 



In Figure 3, we observe the growth of a single dendrite. The pattern falls
downwards. In Figure 4, three nucleation sites were imposed. The dendrites grow,
interact and fall downwards, where they touch the domain boundary.



Fig. 2. Various situations of the advected mean curvature flow (each panel is plotted on axes running from 0 to 0.2):
(a) circle at critical radius: $\xi = 0.008$, $h = 0.022$, $F = -20.0$, $dt = 0.00015$, $\mathbf{V} = (0.0, -100.0)$;
(b) expansion with anisotropy: $\xi = 0.008$, $h = 0.022$, $F = -100.0$, $dt = 0.00015$, $\mathbf{V} = (0.0, -100.0)$;
(c) circle shrinking with rotation: $\xi = 0.008$, $h = 0.022$, $F = 0.0$, $dt = 0.00015$, rotational field (panel label "V: 1000π");
(d) circle at critical radius with rotation: $\xi = 0.008$, $h = 0.022$, $F = -25.0$, $dt = 0.00015$, rotational field (panel label "V: 1000π").
Fig. 3. Solidification of a single falling dendrite - parameters are $u^* = 1.0$, $u_{ini} = 0.0$,
$L = 2.0$, $\beta = 300$, a = 4.0, a = 3, $L_1 = L_2 = 3.0$, initial radius = 0.025,
$N_1 = N_2 = 300$, $\xi = 0.015$

Fig. 4. Solidification of three falling dendrites, shown at the time levels t = 0.02700 and
t = 0.09000 - parameters are $u^* = 1.0$, $u_{ini} = 0$, $L = 2.0$, $\beta = 300$, a = 4.0, a = 3,
$L_1 = L_2 = 3.0$, initial radius = 0.025, $N_1 = N_2 = 300$, $\xi = 0.015$

Summary. Boundary element formulations for eddy current problems are based on
non-local operators. Discretizing these operators by standard Galerkin techniques
leads to large dense matrices. In order to treat the discretized system efficiently, we
cannot store these dense matrices directly, but use data-sparse approximations.

We present an approach based on piecewise polynomial interpolation of the underlying
kernel functions. The resulting $\mathcal{H}^2$-matrix approximation can be stored using
only $O(nm^3)$ units of storage, where $n$ is the number of degrees of freedom and $m$
is the order of the interpolation. Construction and evaluation of the approximated
matrix require only $O(nm^3)$ operations.

This paper presents joint work with Jörg Ostrowski.



1 Introduction 



1.1 Problem 

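The eddy-current model introduced in [6] in combination with impedance
boundary conditions [9] leads to the variational equation

\[
\begin{aligned}
a(\mathbf{v}, \mathbf{u}) - b(\mathbf{v}, \varphi) &= f(\mathbf{v}) && \text{for all } \mathbf{v} \in V, \\
-b(\mathbf{u}, \psi) - q(\psi, \varphi) &= g(\psi) && \text{for all } \psi \in W
\end{aligned}
\]

for the unknown vector field $\mathbf{u} \in V$ and the unknown potential $\varphi \in W$, where
the bilinear forms have the form

\[
\begin{aligned}
a(\mathbf{v}, \mathbf{u}) &= \int_\Gamma\int_\Gamma \langle \operatorname{curl}_\Gamma \mathbf{v}(x), \operatorname{curl}_\Gamma \mathbf{u}(y)\rangle\, \Phi(x, y)\, dy\, dx + \text{sparse}, \\
q(\psi, \varphi) &= \int_\Gamma\int_\Gamma \langle \mathbf{curl}_\Gamma \psi(x), \mathbf{curl}_\Gamma \varphi(y)\rangle\, \Phi(x, y)\, dy\, dx, \\
b(\mathbf{v}, \varphi) &= \int_\Gamma\int_\Gamma \langle \mathbf{curl}_\Gamma \varphi(y), \mathbf{v}(x)\rangle \langle \operatorname{grad}_x \Phi(x, y), n(x)\rangle\, dy\, dx \\
&\quad - \int_\Gamma\int_\Gamma \langle \mathbf{curl}_\Gamma \varphi(y), n(x)\rangle \langle \operatorname{grad}_x \Phi(x, y), \mathbf{v}(x)\rangle\, dy\, dx + \text{sparse}.
\end{aligned}
\]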
Here, $\Phi$ denotes the singularity function for the Laplace equation, i.e.,

\[ \Phi(x, y) := \frac{1}{4\pi\|x - y\|}, \]

and the surface differential operators are defined by

\[ \operatorname{curl}_\Gamma \mathbf{u} := \langle n, \operatorname{curl} \mathbf{u}\rangle, \qquad \mathbf{curl}_\Gamma \psi := (\operatorname{grad} \psi) \times n. \]

The sparse parts of the bilinear forms can be treated by standard techniques,
so we consider only the non-local components of the system.

1.2 Compression 

Discretizing the non-local operators leads to large densely populated matrices
that cannot be handled efficiently by standard techniques. Therefore we use
the $\mathcal{H}^2$-matrix approximation technique [2] to reduce the storage requirements
and the complexity of the discretization and the matrix-vector multiplication.
The application of $\mathcal{H}^2$-matrices to the problem of the eddy current model was
investigated in [7] and [3].

We remark that there is a close relationship of $\mathcal{H}^2$-matrices to the panel
clustering technique [5] and the fast multipole method for integral operators
[8, 4]. While the multipole technique applies an expansion specially designed
for the kernel function under investigation in order to reach the, in some sense 
optimal, complexity of 0{n\o^ n), the TY^-matrix technique has a complexity 
of 0(n\o^ n) but can be applied to any asymptotically smooth (cf. (4)) kernel 
function. 



2 Approximation 

2.1 Discretization 



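We approximate the surface $\Gamma$ by a polyhedron $\Gamma_h$ described by a conforming
triangulation. Its triangles, edges and nodes are denoted by $\mathcal{T}$, $\mathcal{E}$ and $\mathcal{N}$.

For each edge $e \in \mathcal{E}$, we denote the surface edge element basis function (cf.
[3]) by $\mathbf{b}_e$ and define the finite-dimensional subspace $V_h := \operatorname{span}\{\mathbf{b}_e : e \in \mathcal{E}\}$
of $V$. For each node $v \in \mathcal{N}$, the surface nodal basis function is denoted by
$\psi_v$, and $W_h := \operatorname{span}\{\psi_v : v \in \mathcal{N}\}$ is a finite-dimensional subspace of $W$.

The standard Galerkin approach with these basis functions leads to

\[
\begin{pmatrix} A & -B \\ -B^\top & -Q \end{pmatrix}
\begin{pmatrix} \mathbf{u} \\ \varphi \end{pmatrix} = \text{rhs}
\]

defined by

\[ A_{ef} := a(\mathbf{b}_e, \mathbf{b}_f), \quad Q_{vw} := q(\psi_v, \psi_w) \quad \text{and} \quad B_{ew} := b(\mathbf{b}_e, \psi_w) \]

for $e, f \in \mathcal{E}$ and $v, w \in \mathcal{N}$, where all matrices are densely populated.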
2.2 Simplification

For $\mathbf{b} \in V_h$ and $\psi \in W_h$, the functions $\operatorname{curl}_\Gamma \mathbf{b}$ and $\mathbf{curl}_\Gamma \psi$ are piecewise
constant. Therefore we introduce the auxiliary space $X_h$ spanned by piecewise
constant basis functions $\chi_t$ for $t \in \mathcal{T}$ and the auxiliary matrix $G$ defined by

\[ G_{ts} := \int_\Gamma\int_\Gamma \chi_t(x)\, \Phi(x, y)\, \chi_s(y)\, dy\, dx \]

and observe that



$A = L_1^\top G L_1 + \text{sparse}$ and




(3) 



hold for sparse matrices $L_1$ and $L_2$. This representation is more efficient than
the original one, but it is still based on a densely populated matrix.



3 Approximation 

3.1 Interpolation 

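The idea is to replace the kernel function $\Phi$ on a sub-domain $\tau \times \sigma \subseteq \Gamma \times \Gamma$
by

\[ \tilde\Phi^{\tau,\sigma}(x, y) := \sum_{\nu=1}^{k}\sum_{\mu=1}^{k} \Phi(x_\nu^\tau, x_\mu^\sigma)\, \mathcal{L}_\nu^\tau(x)\, \mathcal{L}_\mu^\sigma(y), \]

where $(x_\nu^\tau)_{\nu=1}^k$ and $(x_\mu^\sigma)_{\mu=1}^k$ are interpolation points and $(\mathcal{L}_\nu^\tau)_{\nu=1}^k$ and $(\mathcal{L}_\mu^\sigma)_{\mu=1}^k$
are the corresponding Lagrange polynomials.

This results in an approximation of the local matrix

\[
\begin{aligned}
G^{\tau,\sigma}_{ts} &= \int_\tau\int_\sigma \chi_t(x)\, \Phi(x, y)\, \chi_s(y)\, dy\, dx \approx \int_\tau\int_\sigma \chi_t(x)\, \tilde\Phi^{\tau,\sigma}(x, y)\, \chi_s(y)\, dy\, dx \\
&= \sum_{\nu=1}^{k}\sum_{\mu=1}^{k}
\underbrace{\Phi(x_\nu^\tau, x_\mu^\sigma)}_{=:\, S^{\tau,\sigma}_{\nu\mu}}
\underbrace{\int_\tau \chi_t(x)\, \mathcal{L}_\nu^\tau(x)\, dx}_{=:\, V^\tau_{t\nu}}
\underbrace{\int_\sigma \chi_s(y)\, \mathcal{L}_\mu^\sigma(y)\, dy}_{=:\, V^\sigma_{s\mu}}
= \bigl(V^\tau S^{\tau,\sigma} (V^\sigma)^\top\bigr)_{ts}.
\end{aligned}
\]

The approximation requires only $2nk + k^2$ units of storage. For typical interpolation
schemes, $k$ will be much smaller than $n$, so the factorized representation
will be much more efficient (cf. Figure 1).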
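
The factorization can be tried out in a simplified setting. The sketch below is an illustration only and makes several assumptions that are not taken from the text: it works in one space dimension instead of on a surface, uses the stand-in kernel $1/|x - y|$, and replaces the integrals against piecewise constant basis functions by point evaluations, so the factors are matrices of Lagrange polynomial values. All function names are hypothetical.

```python
import numpy as np


def cheb_points(a, b, m):
    """The m + 1 Chebyshev points of the interval [a, b]."""
    nu = np.arange(m + 1)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * nu + 1) * np.pi / (2 * m + 2))


def lagrange_matrix(nodes, x):
    """Matrix with entries L_nu(x_i), where L_nu are the Lagrange polynomials of `nodes`."""
    L = np.ones((len(x), len(nodes)))
    for nu, xn in enumerate(nodes):
        for mu, xm in enumerate(nodes):
            if mu != nu:
                L[:, nu] *= (x - xm) / (xn - xm)
    return L


def kernel(x, y):
    """One-dimensional stand-in for the singularity function: 1 / |x - y|."""
    return 1.0 / np.abs(x[:, None] - y[None, :])


def interpolated_block(x, y, box_x, box_y, m):
    """Factors (Lx, S, Ly) such that kernel(x, y) is approximated by Lx @ S @ Ly.T."""
    px = cheb_points(box_x[0], box_x[1], m)
    py = cheb_points(box_y[0], box_y[1], m)
    return lagrange_matrix(px, x), kernel(px, py), lagrange_matrix(py, y)


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 50)
    y = np.linspace(3.0, 4.0, 60)
    Lx, S, Ly = interpolated_block(x, y, (0.0, 1.0), (3.0, 4.0), m=6)
    err = np.max(np.abs(kernel(x, y) - Lx @ S @ Ly.T))
    stored = Lx.size + S.size + Ly.size
    print(f"max error {err:.2e}, stored entries {stored} instead of {x.size * y.size}")
```

For well-separated intervals the error drops quickly as `m` grows, while the number of stored entries grows only linearly in the number of points.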

Fig. 1. Compressed representation of local matrices

Typical interpolation schemes, e.g., tensor-product Chebyshev interpolation,
work only for smooth functions. Since the function $\Phi$ is not globally
smooth, we cannot hope to find a factorized representation for the entire matrix
$G$. Instead, we make use of the fact that $\Phi$ is asymptotically smooth, i.e.,
that there are constants $C_{asymp}, c_0, d \in \mathbb{R}_{>0}$ such that

\[ |\partial_x^\alpha \partial_y^\beta \Phi(x, y)| \le C_{asymp}\, c_0^{|\alpha| + |\beta|}\, (\alpha + \beta)!\, \|x - y\|^{-d - |\alpha| - |\beta|} \tag{4} \]

holds for $x, y \in \mathbb{R}^3$ with $x \ne y$.

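Combining this inequality with standard estimates for the interpolation
error, we find that there is a polynomial $C_{apx}$ such that

\[
\|\Phi - \tilde\Phi^{\tau,\sigma}\|_{\infty, B^\tau \times B^\sigma} \le
\frac{C_{apx}(m)}{\operatorname{dist}(B^\tau, B^\sigma)^{d}}
\left(1 + \frac{2\operatorname{dist}(B^\tau, B^\sigma)}{\operatorname{diam}(B^\tau \times B^\sigma)}\right)^{-(m+1)}
\]

holds for tensor-product Chebyshev interpolation of order $m \in \mathbb{N}$ (corresponding
to a rank of $k = (m + 1)^3$) on axis-parallel boxes $B^\tau$ and $B^\sigma$ satisfying
$\tau \subseteq B^\tau$ and $\sigma \subseteq B^\sigma$.

To ensure a uniform rate of convergence, the admissibility condition

\[ \operatorname{diam}(B^\tau \times B^\sigma) \le 2\eta \operatorname{dist}(B^\tau, B^\sigma) \tag{5} \]

has to hold for a constant $\eta \in \mathbb{R}_{>0}$ (cf. Figure 2).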
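
An admissibility test of the form $\operatorname{diam}(B^\tau \times B^\sigma) \le 2\eta \operatorname{dist}(B^\tau, B^\sigma)$ is cheap to evaluate for axis-parallel boxes. The following sketch shows one possible way to do it; the representation of a box as a pair `(lo, hi)` of corner arrays and the function names are assumptions made for this illustration.

```python
import numpy as np


def box_diam_product(box_t, box_s):
    """Euclidean diameter of the product box B^t x B^s."""
    (lo_t, hi_t), (lo_s, hi_s) = box_t, box_s
    return np.sqrt(np.sum((hi_t - lo_t) ** 2) + np.sum((hi_s - lo_s) ** 2))


def box_dist(box_t, box_s):
    """Euclidean distance between two axis-parallel boxes."""
    (lo_t, hi_t), (lo_s, hi_s) = box_t, box_s
    # Per coordinate, the gap is positive only if the boxes do not overlap there.
    gap = np.maximum(0.0, np.maximum(lo_t - hi_s, lo_s - hi_t))
    return np.linalg.norm(gap)


def is_admissible(box_t, box_s, eta):
    """Admissibility check: diam(B^t x B^s) <= 2 * eta * dist(B^t, B^s)."""
    return box_diam_product(box_t, box_s) <= 2.0 * eta * box_dist(box_t, box_s)
```

A larger value of `eta` accepts more blocks as admissible, at the price of a slower decay of the interpolation error.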




3.2 Cluster tree and block partition 

Since we can apply our interpolation scheme only to sub-domains $\tau \times \sigma$ satisfying
the admissibility condition (5), we have to split the entire domain $\Gamma_h \times \Gamma_h$
into sub-domains that either fulfil this condition or are so small that we can
store the corresponding local matrix block in the standard format.

We construct the splitting of $\Gamma_h \times \Gamma_h$ using a hierarchical splitting of the
domain $\Gamma_h$ that is called a cluster tree:

Definition 1. A tree $C$ is called a cluster tree for a set $\Gamma_h$ if

- the set $\Gamma_h$ is the root of $C$, i.e., $\operatorname{root}(C) = \Gamma_h$, and

- if a node $\tau \in C$ is not a leaf, then it is the (up to sets of measure zero)
  disjoint union of its sons, i.e.,

\[ \tau = \bigcup \{\tau' : \tau' \in \operatorname{sons}(\tau)\}. \]

Each node $\tau$ is called a cluster.

A cluster tree can be constructed from an arbitrary set of triangles by binary
space partitioning: We start with the root cluster containing all the triangles,
corresponding to the entire domain, split it into two son clusters and repeat
the procedure recursively until the clusters contain fewer than $k$ triangles.
Using the cluster tree, we can construct a partition

\[ P \subseteq \{\tau \times \sigma : \tau, \sigma \in C\} \]

of $\Gamma_h \times \Gamma_h$ containing only admissible and small blocks (cf. [1]).
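
The construction can be sketched in a few lines. The code below is a simplified illustration, not the construction of [1]: each triangle is represented by a single point (for instance its centroid), bounding boxes are taken over these points rather than over the triangles themselves, clusters are split at the median along the longest box edge, and a block is recorded as far-field as soon as it satisfies an admissibility test of the form (5). Class and function names are hypothetical.

```python
import numpy as np


class Cluster:
    """Node of a cluster tree built by binary space partitioning."""

    def __init__(self, idx, points, leaf_size):
        self.idx = idx                         # indices of the triangles in this cluster
        self.lo = points[idx].min(axis=0)      # axis-parallel bounding box
        self.hi = points[idx].max(axis=0)
        self.sons = []
        if len(idx) >= max(leaf_size, 2):      # split until fewer than leaf_size remain
            axis = int(np.argmax(self.hi - self.lo))
            order = idx[np.argsort(points[idx, axis])]
            half = len(order) // 2
            self.sons = [Cluster(order[:half], points, leaf_size),
                         Cluster(order[half:], points, leaf_size)]


def is_admissible(t, s, eta):
    """diam(B^t x B^s) <= 2 * eta * dist(B^t, B^s), with touching boxes rejected."""
    diam = np.sqrt(np.sum((t.hi - t.lo) ** 2) + np.sum((s.hi - s.lo) ** 2))
    gap = np.maximum(0.0, np.maximum(t.lo - s.hi, s.lo - t.hi))
    dist = np.linalg.norm(gap)
    return dist > 0.0 and diam <= 2.0 * eta * dist


def build_partition(t, s, eta, far, near):
    """Collect admissible blocks in `far` and small inadmissible blocks in `near`."""
    if is_admissible(t, s, eta):
        far.append((t, s))
    elif not t.sons and not s.sons:
        near.append((t, s))
    else:
        for t_son in (t.sons or [t]):
            for s_son in (s.sons or [s]):
                build_partition(t_son, s_son, eta, far, near)
```

Calling `build_partition(root, root, eta, far, near)` with two empty lists yields blocks that cover every index pair exactly once, which is the property needed to assemble the approximation block by block.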

3.3 Construction of matrices 

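The matrix $V^\tau$ for each cluster $\tau$ is given by

\[ V^\tau_{t\nu} := \int_\tau \chi_t(x)\, \mathcal{L}_\nu^\tau(x)\, dx. \]

Since $\chi_t$ and $\mathcal{L}_\nu^\tau$ are polynomials, the entries of the matrix can be computed
by standard quadrature.

Since the coefficient matrix $S^{\tau,\sigma}$ is defined by

\[ S^{\tau,\sigma}_{\nu\mu} := \Phi(x_\nu^\tau, x_\mu^\sigma) \]

for each admissible sub-domain $\tau \times \sigma$, it can be constructed by simply
evaluating the expression (2).

Setting

\[ P_{far} := \{\tau \times \sigma \in P : \tau \times \sigma \text{ is admissible}\}, \quad P_{near} := P \setminus P_{far}, \]

the approximation of the matrix $G$ is defined by

\[ \tilde G := \sum_{\tau \times \sigma \in P_{far}} V^\tau S^{\tau,\sigma} (V^\sigma)^\top + \sum_{\tau \times \sigma \in P_{near}} G^{\tau,\sigma}. \tag{6} \]

The approximation (6) requires $O(nk \log n) = O(nm^3 \log n)$ units of storage,
and the approximation error decreases exponentially in $m$.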
4 Nested cluster basis

Now, we aim to use the special structure of the matrices $V^\tau$, the cluster bases,
in order to reduce the storage complexity for the matrix $\tilde G$.



4.1 Nested approximation spaces 



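Let $\tau$ be a cluster and $\tau'$ one of its sons. Since we use the same order of
interpolation for all clusters, we have

\[ \mathcal{L}_\nu^\tau = \sum_{\nu'=1}^{k} T^{\tau'}_{\nu'\nu}\, \mathcal{L}_{\nu'}^{\tau'} \quad \text{with} \quad T^{\tau'}_{\nu'\nu} := \mathcal{L}_\nu^\tau(x_{\nu'}^{\tau'}), \]

which implies

\[ V^\tau = \sum_{\tau' \in \operatorname{sons}(\tau)} V^{\tau'} T^{\tau'}, \tag{7} \]

so we need to store the $n \times k$ matrix $V^\tau$ only for clusters without sons, since
we can reconstruct it for all other clusters by using the $k \times k$ matrices $T^{\tau'}$
(cf. Figure 3). This reduces the storage complexity to $O(nk) = O(nm^3)$.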
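
The nestedness relation behind transfer matrices can be verified numerically in a simplified setting. The sketch below works in one dimension and, as an assumption made only for this illustration, uses point evaluations of the Lagrange polynomials in place of the integrals against basis functions. It checks that the parent's Lagrange polynomials, evaluated at points of a son interval, are reproduced by the son's Lagrange polynomials combined with a transfer matrix whose entries are the parent polynomials evaluated at the son's interpolation points. All names are hypothetical.

```python
import numpy as np


def cheb_points(a, b, m):
    """The m + 1 Chebyshev points of the interval [a, b]."""
    nu = np.arange(m + 1)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * nu + 1) * np.pi / (2 * m + 2))


def lagrange_matrix(nodes, x):
    """Matrix with entries L_nu(x_i), where L_nu are the Lagrange polynomials of `nodes`."""
    L = np.ones((len(x), len(nodes)))
    for nu, xn in enumerate(nodes):
        for mu, xm in enumerate(nodes):
            if mu != nu:
                L[:, nu] *= (x - xm) / (xn - xm)
    return L


def transfer_matrix(parent_box, son_box, m):
    """T[nu', nu] = L_nu^parent(x_nu'^son) for interpolation of order m on both boxes."""
    parent_nodes = cheb_points(parent_box[0], parent_box[1], m)
    son_nodes = cheb_points(son_box[0], son_box[1], m)
    return lagrange_matrix(parent_nodes, son_nodes)


if __name__ == "__main__":
    m = 4
    parent, son = (0.0, 1.0), (0.0, 0.5)
    x = np.linspace(0.0, 0.5, 20)                      # points inside the son
    V_parent = lagrange_matrix(cheb_points(*parent, m), x)
    V_son = lagrange_matrix(cheb_points(*son, m), x)
    T = transfer_matrix(parent, son, m)
    print(np.max(np.abs(V_parent - V_son @ T)))        # rounding-level difference
```

The identity is exact up to rounding because each parent polynomial has degree at most `m` and is therefore reproduced by interpolation of the same order on the son.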




Fig. 3. Representation of cluster basis matrices by transfer matrices 



4.2 Fast matrix- vector multiplication 

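Equation (7) can also be used to perform the matrix-vector multiplication
efficiently: We introduce the auxiliary variables

\[ u^\sigma := (V^\sigma)^\top u, \qquad v^\tau := \sum_{\sigma :\, \tau \times \sigma \in P_{far}} S^{\tau,\sigma} u^\sigma \]

for clusters $\tau, \sigma \in C$ and find

\[
\begin{aligned}
\tilde G u &= \sum_{(\tau,\sigma) \in P_{near}} G^{\tau,\sigma} u + \sum_{(\tau,\sigma) \in P_{far}} V^\tau S^{\tau,\sigma} (V^\sigma)^\top u \\
&= \sum_{(\tau,\sigma) \in P_{near}} G^{\tau,\sigma} u + \sum_{(\tau,\sigma) \in P_{far}} V^\tau S^{\tau,\sigma} u^\sigma \\
&= \sum_{(\tau,\sigma) \in P_{near}} G^{\tau,\sigma} u + \sum_{\tau \in C} V^\tau v^\tau.
\end{aligned}
\]

For typical domain splittings, the set $\{\sigma : \tau \times \sigma \in P_{far}\}$ contains only a small
number of elements, so the computation of $v^\tau$ can be done in complexity $O(nk)$
for all $\tau \in C$. This leaves us with the task of evaluating $u^\sigma$ and $\sum_{\tau \in C} V^\tau v^\tau$
efficiently. Due to (7), we have

\[ u^\sigma = (V^\sigma)^\top u = \sum_{\sigma' \in \operatorname{sons}(\sigma)} (T^{\sigma'})^\top (V^{\sigma'})^\top u = \sum_{\sigma' \in \operatorname{sons}(\sigma)} (T^{\sigma'})^\top u^{\sigma'}, \]

so we can use a recursive procedure to compute the vectors $u^\sigma$ for all $\sigma \in C$
in $O(nk)$ operations. A similar recursion can be used to construct $\sum_{\tau \in C} V^\tau v^\tau$
in $O(nk)$ operations.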
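
The recursive evaluation of the coefficient vectors can be sketched as follows. This is a one-dimensional illustration under assumptions that are not taken from the text: clusters are intervals that are bisected geometrically, the cluster basis consists of point evaluations of Lagrange polynomials rather than integrals, and leaf matrices are stored only where a cluster has no sons. The names `Cluster` and `forward` are hypothetical.

```python
import numpy as np


def cheb_points(a, b, m):
    """The m + 1 Chebyshev points of the interval [a, b]."""
    nu = np.arange(m + 1)
    return 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * nu + 1) * np.pi / (2 * m + 2))


def lagrange_matrix(nodes, x):
    """Matrix with entries L_nu(x_i), where L_nu are the Lagrange polynomials of `nodes`."""
    L = np.ones((len(x), len(nodes)))
    for nu, xn in enumerate(nodes):
        for mu, xm in enumerate(nodes):
            if mu != nu:
                L[:, nu] *= (x - xm) / (xn - xm)
    return L


class Cluster:
    """Interval cluster; stores V only in leaves and transfer matrices T otherwise."""

    def __init__(self, idx, box, x, m, leaf_size):
        self.idx, self.box = idx, box
        self.nodes = cheb_points(box[0], box[1], m)
        self.sons, self.T, self.V = [], [], None
        if len(idx) > leaf_size:
            mid = 0.5 * (box[0] + box[1])
            parts = [(idx[x[idx] < mid], (box[0], mid)),
                     (idx[x[idx] >= mid], (mid, box[1]))]
            for son_idx, son_box in parts:
                son = Cluster(son_idx, son_box, x, m, leaf_size)
                self.sons.append(son)
                # T[nu', nu] = parent polynomial nu evaluated at son node nu'
                self.T.append(lagrange_matrix(self.nodes, son.nodes))
        else:
            self.V = lagrange_matrix(self.nodes, x[idx])


def forward(cluster, u):
    """Coefficient vector (V^cluster)^T u, assembled from the sons via transfer matrices."""
    if not cluster.sons:
        return cluster.V.T @ u[cluster.idx]
    return sum(T.T @ forward(son, u) for son, T in zip(cluster.sons, cluster.T))
```

For clarity this version recomputes the sons' vectors on every call; an implementation aiming at the linear complexity stated above would visit each cluster once and keep all intermediate vectors, since they are needed for the far-field blocks.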



5 Approximation of the double layer potential 

Now that we know how to store and multiply by the matrices A and Q in 
complexity 0{nk), we only have to handle the matrix B efficiently. 

We replace ^ by the local approximations and grad^, ^ by grad^, 
and recall the sparse lifting matrix L2 from (3) in order to find that 

/V^Tsr,aw^d\ 

B = Lj + Y 

tx<t€P, ar rxa£P„e:M 

is a good approximation of B, where 

= f (bj)^(x)(grad£^(x),n(x)) -n^(x)(grad£^(x),b^(x))dx 
J a 

for j G /i G {1, . . . , A:} and G {1, . . . , 3}. 

For clusters with sons, the matrices can again be expressed by the 
transfer matrices so we can re-use most of the matrices involved in the 

approximation of G and require additional storage only for clusters without 
sons and for the near-field. 



6 Numerical experiment 

We approximate the auxiliary matrix $G$ on the unit sphere using a local interpolation
operator of order $m = 2$. We use the admissibility condition (5) for $\eta = 2$
and find the results given in Table 1. We can see that the approximation error
is stable, while the memory and time requirements grow linearly.
Table 1. Approximation of the auxiliary matrix G

        n    Mem[MB]   Mem[KB]/n   Build[s]   MVM[ms]    Error
      512        3.9        7.68          1         4   2.3e-4
     2048       21.6       10.54          5        46   4.2e-4
     8192      113.9       13.90         18       269   4.3e-4
    32768      461.4       14.08         79      1138   4.2e-4
   131072     2024.4       15.45        305      4990        -
   524288     7976.1       15.21       1224     19772        -
  2097152    32023.8       15.27       4974     83181        -

Summary. An adaptive method for reactive flows involving locally refined meshes
and different types of diffusion models is proposed. Starting with a less exact diffusion
model, the model is changed locally throughout the computational domain to a more
accurate and much more expensive model. An a posteriori error estimator provides
reliable information on where to refine the mesh and where to adapt the model.
Discretization and modeling errors are equilibrated.



1 Introduction 

The underlying equations describing reactive flows are well known but may
involve models of different complexity, scales and accuracy. In various cases, the
most accurate and validated model cannot be chosen in numerical simulations
because of its high computational cost. Simpler models usually
need less computing time and involve fewer couplings between variables. For
instance, the choice of diffusion models in gas mixtures is not straightforward.
Although multicomponent diffusion models are accepted to be accurate [8],
simpler and less accurate models, e.g. Fick's law, are widely used in practice
for two- and three-dimensional simulations. While simple diffusion models, such as
Fick's law, may involve only diagonal diffusion, multicomponent
diffusion models lead to couplings between all chemical variables. For implicit
solvers, these couplings lead to a huge fill-in in the (sparse) Jacobians. Due to
the resulting high numerical cost of complex models, it is desirable to apply
the complex diffusion model only in those regions of the computational domain
where it is necessary, for instance in the flame front, where a complex balance of
reaction, convection and diffusion phenomena takes place. However, it is not
known a priori where an accurate diffusion model is necessary. In this work,
we present an adaptive method which automatically detects the regions where
an accurate diffusion model is important.

In the previous work [6], the mathematical background for a posteriori
control of modeling errors and discretization errors is given. Other work
addressing the estimation of modeling error includes [9, 13, 14]. For measuring
the modeling error, the variational formulation of the partial differential
equation together with a duality argument is used. The a posteriori error estimator
for the discretization error is obtained by using Galerkin orthogonality of a
finite element discretization [3]. These two aspects allow us to measure the overall
error in terms of user-defined output functionals of the numerical solution.
Both types of adaptivity (mesh size and diffusion model) are merged in such
a way that both sources of error are equilibrated and subsequently reduced.
This strategy avoids the situation that the accuracy of the solution is affected
by a poor diffusion model although very fine meshes are used, and vice versa.

As an extension of the theoretical fundamentals in [6], here we present the
application of mesh size adaptation and diffusion model adaptation and show
numerical results for a combustion problem with two types of diffusion models.

We use the following notations: The usual scalar product in a subdomain
$\omega \subset \Omega$ will be denoted by $(\cdot, \cdot)_\omega$. By $(u, v) = (u, v)_\Omega$ we denote the
integration over the entire computational domain $\Omega \subset \mathbb{R}^d$, $d = 2, 3$. We
denote the velocity by $v$, pressure by $p$, temperature by $T$, the species mass
fractions by $y_k$, $k = 1, \dots, n_s$, and density by $\rho$.



2 Variational formulation of the underlying equations 



We start from the basic equations in variational formulation for steady-state
reactive viscous flow describing the conservation of mass, momentum, energy
and species mass fractions. For this we assemble all variables in the vector
$u = \{p, v, T, y_1, \dots, y_s\}$, which is an element of a functional Hilbert space $V$.
For test functions $\phi = (\xi, \psi, \sigma, \tau_1, \dots, \tau_s) \in V$, $s := n_s - 1$, the following
nonlinear forms are used:

\[
\begin{aligned}
a_1(u)(\phi) &:= (\operatorname{div}(\rho v), \xi), \\
a_2(u)(\phi) &:= (\rho (v\cdot\nabla)v, \psi) + (\pi, \nabla\psi) - (p, \operatorname{div}\psi), \\
a_3(u)(\phi) &:= (\rho c_p v\cdot\nabla T, \sigma) + (\lambda\nabla T, \nabla\sigma) + \sum_{k=1}^{n_s} (h_k m_k \omega_k, \sigma), \\
a_4(u)(\phi) &:= \sum_{k=1}^{s} \bigl\{(\rho v\cdot\nabla y_k, \tau_k) - (m_k \omega_k, \tau_k)\bigr\}, \\
a(u)(\phi) &:= \sum_{i=1}^{4} a_i(u)(\phi).
\end{aligned}
\]

The mass fraction of the last species is set to $y_{n_s} = 1 - \sum_{k=1}^{s} y_k$ in order
to ensure that the sum over all $y_k$ is equal to 1. The density is considered as
a coefficient determined by the gas law



\[ \rho = \frac{p\,\bar m}{R\,T}, \qquad \bar m = \left(\sum_{k=1}^{n_s} \frac{y_k}{m_k}\right)^{-1}, \tag{1} \]

with the universal gas constant $R$ and the mean molar weight $\bar m$. Coefficients
are the viscosity $\mu$, the heat capacity $c_p$ at constant pressure, and the heat
conductivity $\lambda$. For each species $k$, we have the molecular weight $m_k$, the
specific enthalpy $h_k$, and the molecular production rate $\omega_k$. The viscous tensor
$\pi$ is given as usual for compressible Newtonian fluids. For ease of presentation,
in the form $a_3(u)(\phi)$ the effect of temperature flux due to diffusion fluxes of
species with different specific heat capacities is neglected.

The form $a_4(u)(\phi)$ does not yet include species diffusion. We consider the
following two models for the mass diffusion fluxes $\mathcal{F}_k$:

(i) Fick's law: diagonal diffusion driven by the gradient of mole fractions
$x_k = y_k \bar m / m_k$,

\[ \mathcal{F}_k = -\rho D_k^* \nabla x_k. \tag{2} \]

The diffusion coefficients $D_k^* = D_k^*(y)$ are given by an empirical law which
is about 10% accurate, see [12].

(ii) Multicomponent diffusion: a full diffusion matrix driven by the gradients
of species mole fractions:

\[ \mathcal{F}_k = -\rho y_k \left( \sum_{l=1}^{n_s} D_{kl} \nabla x_l + \theta_k \nabla(\log T) \right). \tag{3} \]

The diffusion coefficients $D_{kl}$ are given only implicitly as solutions of linear
systems. Therefore, the computational costs are higher than for the previous
diffusion model. However, this form of diffusion flux can be derived from the
theory of gases, see [8, 10].

Recent DNS computations [7] investigate the differences between these two models
and show a clear impact of model (ii), especially for lean and rich hydrogen
flames.
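
The mixture quantities that enter both diffusion models are simple to evaluate. The sketch below computes the mean molar weight, the density from the ideal gas law, and the mole fractions $x_k = y_k \bar m / m_k$ from given mass fractions. It is an illustration only: the function names, the use of SI units (kg/mol, Pa, K) and the two-species test mixture in the usage example are assumptions, not data from this work.

```python
import numpy as np

R_UNIVERSAL = 8.314462618  # universal gas constant in J/(mol K)


def mean_molar_weight(y, m):
    """Mean molar weight: inverse of sum_k y_k / m_k."""
    return 1.0 / np.sum(y / m)


def density(p, T, y, m):
    """Ideal gas law: rho = p * mean molar weight / (R * T)."""
    return p * mean_molar_weight(y, m) / (R_UNIVERSAL * T)


def mole_fractions(y, m):
    """Mole fractions x_k = y_k * mean molar weight / m_k."""
    return y * mean_molar_weight(y, m) / m


if __name__ == "__main__":
    # Hypothetical O2/N2 mixture resembling air (mass fractions, molar weights in kg/mol).
    y = np.array([0.233, 0.767])
    m = np.array([0.031998, 0.0280134])
    print(density(101325.0, 300.0, y, m))   # roughly 1.17 kg/m^3
    print(mole_fractions(y, m))             # sums to 1
```

Because the mole fractions depend on all mass fractions through the mean molar weight, even the diagonal Fick model couples the species equations weakly through $\nabla x_k$.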

Partial integration of the multicomponent model (3) leads to the semi-linear 
form 

k=l 

The residual of the problem will be denoted by

\[ \rho(u)(\phi) := (f, \phi) - a(u)(\phi) - d(u)(\phi). \]

Now, the solution $u$ fulfills the equation

\[ u \in \hat u + V: \quad \rho(u)(\phi) = 0 \quad \forall \phi \in V, \]

where non-homogeneous Dirichlet conditions for $u$ are described by $\hat u$. Homogeneous
Dirichlet conditions are already included in the choice of the space $V$.



3 Discretization 

The discrete counterpart of the equations is the basis for the a posteriori error 
estimator we need later for the adaptive procedure. 
In order to switch locally between the diffusion models (2) and (3), we
introduce a (symbolic) parameter $m$ and a subdomain $\Omega_m$ of $\Omega$. In $\Omega_m$ we
use multicomponent diffusion, and in $\Omega_f := \Omega \setminus \Omega_m$ the simpler Fick's law.
We define the diffusion part

s 

dmium + 

k=0 

and formulate a perturbed solution Um G V solving 

a{Um){(l>) + djn{Um){<P) = if,4>) '^(f> G V . 

The way to determine the subdomain Qrn will be explained later on. 

The discretization is done by conforming finite elements on a triangulation
$\mathcal{T}_h$ of $\Omega$. We denote the corresponding space by $V_h \subset V$. In order to handle
the convective terms and the stiff pressure-velocity coupling, one should add
stabilization terms $S_h(\cdot)(\cdot)$ to the discrete systems. Such stabilization can be
done in different ways. We use the stabilization concept for the $v,p$-coupling
proposed in [1] and for the convective terms in [11] and [2]. For this purpose, we
need certain restrictions on the meshes used. We assume that the triangulation
$\mathcal{T}_h$ is organized patch-wise: $\mathcal{T}_h$ results from a global refinement of a mesh $\mathcal{T}_{2h}$.
Note that $\mathcal{T}_h$ contains in two dimensions twice as many hanging nodes as $\mathcal{T}_{2h}$.
The same construction is possible in three dimensions.

By ^2h • y 2 h we denote the nodal interpolation to the coarse grid, and 

hy TTh i — i 2 h '' y^h yh the projection operator which filters the small-scale 
fluctuations The stabilization form reads now: 

S{u){(f)) = {VTThP.SpVTThO + • 'y)7ThV,Sy{f3y • V)'Kh'il^) 

s 

+{Pt ■ Vtt/jT, StPt ■ VTTftd) + • Vnhyk, SkPv ■ , 

k=\ 

with f3y pu, Pt := {pCpV + a) and piece- wise constant functions 

(^ 1 , • • • 5 depending on the mesh-size h. For further details, we refer 

to [2]. 

This leads us to the definition of the discrete residual

\[ \rho_{hm}(u)(\phi) := (f, \phi) - a(u)(\phi) - d_m(u)(\phi) - S_h(u)(\phi). \]

The reduced discrete system to be solved reads

\[ u_{hm} \in \hat u_h + V_h: \quad \rho_{hm}(u_{hm})(\phi) = 0 \quad \forall \phi \in V_h. \]

The difference between the continuous and the reduced discrete residual is

\[
\begin{aligned}
g_{hm}(u)(\phi) &:= \rho(u)(\phi) - \rho_{hm}(u)(\phi) \\
&= -d(u)(\phi) + d_m(u)(\phi) + S_h(u)(\phi)
\end{aligned}
\]

= , VTfc)n, + Sf,{u)id>) . 

k=0 

This is the difference of the two diffusion models in the domain $\Omega_f$, where
Fick's law is used, and the additional stabilization terms, which are usually
small. Since in this work we do not focus on adaptively chosen stabilization
terms, we neglect the contribution $S_h(u)(\phi)$ in the expression $g_{hm}(u)(\phi)$.
4 A posteriori control

For the local refinement of the mesh and the adaptivity with respect to the diffusion
model, we need an a posteriori error estimator which gives us information
on the two error contributions. We aim to measure the error with respect to an
arbitrary output functional $j : V \to \mathbb{R}$. We formulate the following dual
residuals

\[
\begin{aligned}
\rho^*(u)(z, \phi) &:= j(\phi) - a'(u)(\phi, z) - d'(u)(\phi, z), \\
\rho^*_{hm}(u)(z, \phi) &:= j(\phi) - a'(u)(\phi, z) - d_m'(u)(\phi, z) - S_h'(u)(\phi, z).
\end{aligned}
\]

We use the dual solution $z \in V$ to capture the influence of the error on the
functional:

\[ z \in V: \quad \rho^*(u)(z, \phi) = 0 \quad \forall \phi \in V. \]

The corresponding reduced discrete version reads:

\[ z_{hm} \in V_h: \quad \rho^*_{hm}(u_{hm})(z_{hm}, \phi) = 0 \quad \forall \phi \in V_h. \]

The errors will be denoted by $e_u = u - u_{hm}$ and $e_z = z - z_{hm}$. We recall the
error representation in [6], wherein a proof is also given.

Theorem 1. If the semi-linear forms $a(u)(\cdot)$, $d(u)(\cdot)$, $S_h(u)(\cdot)$ and the functional
$j(u)$ are sufficiently differentiable with respect to $u$, then it holds

j('^) j(d^hm) — Qhrnid^hrn^i^^hrrC) 

where $i_h : V \to V_h$ is an arbitrary interpolation operator and $R$ is a remainder
which is cubic in the error $e = \{e_u, e_z\}$.

For the specific form of the remainder $R$ and a proof of this theorem, we refer
to [6].

The error representation of the theorem stated above cannot be used directly
in computations, because it involves the unknown primal and dual solutions $u$
and $z$, respectively. However, the first term $g_{hm}(u_{hm})(z_{hm})$ can be easily computed,
because it depends only on the reduced discrete solutions $u_{hm}$ and $z_{hm}$.
Furthermore, the terms $g_{hm}(u_{hm})(e_z)$ and $g'_{hm}(u_{hm})(e_u, z_{hm})$ are quadratic in
the modeling error, since they involve both $e$ as an argument and the difference
of the two diffusion models in the expression $g_{hm}$. We neglect these contributions
in the estimator. However, if more accuracy of the estimator is required,
these terms can also be approximated. Evaluation of the remainder $R$ is not
worthwhile, because it is cubic in the error.

The terms $\varrho_{hm}(u_{hm})(z - i_h z)$ and
$\varrho^*_{hm}(u_{hm}, z_{hm})(u - i_h u)$ describe the discretization error
and have to be approximated by a numerical evaluation of the interpolation
errors $z - i_h z$ and $u - i_h u$. An efficient way to do this is to recover
the computed quantities by higher-order polynomials, see [4]. For instance, in
the case of quadrilaterals and piecewise bilinear elements (so-called $Q_1$
elements), the interpolation can be done on biquadratic elements. Let
$i_{2h}^{(2)}$ denote the quadratic interpolation of piecewise bilinears on
$T_h$ onto biquadratic finite elements on $T_{2h}$.

The interpolation error of $z$, for instance, is approximated numerically by

\[ z - i_h z \approx i_{2h}^{(2)} z_{hm} - z_{hm} . \]

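Taking into account that the residuals $\varrho_{hm}(u_{hm})(\phi)$ and
$\varrho^*_{hm}(u_{hm}, z_{hm})(\phi)$ vanish for every discrete test function
$\phi \in V_h$, we arrive at the following estimator $\eta$ consisting of two
parts:

\[ j(u) - j(u_{hm}) \approx \eta := \eta_h + \eta_m , \qquad (4) \]

\[ \eta_h := \frac{1}{2} \left\{ \varrho_{hm}(u_{hm})(i_{2h}^{(2)} z_{hm}) + \varrho^*_{hm}(u_{hm}, z_{hm})(i_{2h}^{(2)} u_{hm}) \right\} , \qquad (5) \]

\[ \eta_m := g_{hm}(u_{hm})(z_{hm}) . \qquad (6) \]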

The part $\eta_h$ of the estimator can be regarded as the contribution of the
discretization, and the part $\eta_m$ measures the influence of the model. For
multicomponent diffusion, the evaluation of $\eta_m$ is expensive. However, the
gain of an adaptive algorithm with local model modification becomes
substantial, since the global detailed model has to be included neither in each
residual evaluation nor in the Jacobian for solving the equations.

On the basis of the estimators $\eta_h$ and $\eta_m$, local error indicators
can be derived. A standard method is partial integration of the diffusive parts
and the application of the Cauchy-Schwarz inequality on each element. We
proceed in a different way by filtering coarse grid contributions. We obtain
nodal quantities $\eta_{h,i}$ and $\eta_{m,i}$ for each node $i$ of the mesh:

\[ |\eta| \le \sum_{i=1}^{n} \left( \eta_{h,i} + \eta_{m,i} \right) . \]

A proof is given in [6]. Further details, especially concerning the adaptation 
strategy for equilibrating both error contributions are given in [5]. 



5 Computation of an ozone flame 

We investigate the methodology presented above for an ozone decomposition 
flame in a two-dimensional geometry with a moderate impact of the diffusion 
model. Moreover, the complexity of the problem size is moderate enough to 
obtain a reference solution on a very fine mesh. This reference solution is used 
for validation of the error estimators. A more involved example can be found
in [5], where a hydrogen flame is investigated.

The flame under consideration is modeled with three chemical species,
namely $\mathrm{O}_3$, $\mathrm{O}_2$, and O atoms. The reaction mechanism
consists of six reactions, see [3]. At the inflow (x = 0), we prescribe 20% mass
fraction for ozone and 80% mass fraction for oxygen. Furthermore, the inflow
velocity profile is parabolic with maximum velocity v = 35 cm/s. The
computational domain corresponds to a Cartesian tube,
$\Omega = [0, 2\,\mathrm{cm}] \times [0, 5\,\mathrm{mm}]$. The temperature
is fixed at the inflow, bottom, and top boundary according to

T = Train + ^T • exp{-C7(x - Xq)^} , 

with xq = 5 mm, a = 10^, T^^^ = 298 Ff, and AT = 502 JT. A homogeneous 
Neumann condition is imposed at the outflow boundary. 

Figure 1 shows the computed profile of the mass fractions of O-atoms 
when the multicomponent diffusion model is used over the whole computa- 
tional domain. For this type of flame, Fick’s law is a good approximation of 
the multicomponent diffusion model. Therefore, the difference between both 
diffusion models is small, and the computation with Fick’s law yields very 
similar pictures. However, the difference in the model is in the same range as 
the discretization error on the coarsest mesh. 

Now, we report on the adaptive algorithm described before. The initial
mesh is an equidistant mesh with 585 nodes. We start with Fick's law (2)
for the species mass diffusion fluxes over the whole computational domain.
After computing the stationary solution $u_h$ on this mesh with the crude
diffusion model, we compute the associated dual solution for the functional
$j(u) = c \int_\Omega y_{\mathrm{O}}\, dx$, with a constant scaling factor $c$.
This functional gives the mean value of the mass fraction of O atoms in the
computational domain $\Omega$. The error indicators $\eta_h$ and $\eta_m$ are
obtained from (5) and (6). On the basis of $\eta_h$ we change the mesh size
locally by bisection of element edges. According to $\eta_m$, we change the
diffusion model locally from Fick's diffusion to multicomponent diffusion.
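The marking step of such a loop can be sketched as follows. This is only an illustration: the text does not specify how nodes are selected from the indicators (the actual equilibration strategy is described in [5]), so the bulk marking rule, the function names `mark_bulk` and `adapt_step`, and the parameter `theta` are all assumptions made here.

```python
def mark_bulk(indicators, theta):
    """Return indices of the largest indicators whose sum reaches theta * total.

    Assumed bulk marking rule, not necessarily the strategy used in [5].
    """
    magnitudes = [abs(e) for e in indicators]
    target = theta * sum(magnitudes)
    # Visit nodes from the largest to the smallest indicator.
    order = sorted(range(len(magnitudes)), key=lambda i: magnitudes[i], reverse=True)
    marked, accumulated = [], 0.0
    for i in order:
        if accumulated >= target:
            break
        marked.append(i)
        accumulated += magnitudes[i]
    return marked


def adapt_step(eta_h_nodal, eta_m_nodal, theta=0.5):
    """One marking step of the (hypothetical) adaptive loop.

    Returns the nodes flagged for mesh refinement (driven by eta_h) and the
    nodes flagged for switching from Fick's law to multicomponent diffusion
    (driven by eta_m).
    """
    refine = mark_bulk(eta_h_nodal, theta)
    switch_model = mark_bulk(eta_m_nodal, theta)
    return refine, switch_model
```

Calling `adapt_step` with the nodal quantities $\eta_{h,i}$ and $\eta_{m,i}$ yields two index lists that a mesh and model manager could then act on; how the two contributions are balanced against each other is left open in this sketch.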

This procedure is iterated several times. Results are listed in Table 1. The
first column contains the number of nodes of the mesh, the second column
contains the relative amount of cells (in percent) where the multicomponent
diffusion model is used. The two parts of the estimator, $\eta_h$ and $\eta_m$,
are listed in columns 3 and 4, respectively. The sum of these two terms yields
the error estimator $\eta$ (column 5). The true error $j(u - u_{hm})$ in
column 6 is obtained by using the reference solution computed on a very fine
mesh and with the accurate diffusion model. The effectivity index
$I_{\mathrm{eff}} = \eta / j(u - u_{hm})$ is listed in the last column. For an
exact error estimator, the effectivity index would be equal to 1. Our values
are in the range of 1.5 to 2.6, which means that the estimator $\eta$ slightly
overestimates the error.




166 M. Braack, A. Era 



Table 1. Results for the ozone flame: number of nodes (#nodes); fraction of cells
flagged for multicomponent diffusion (% of multi.); estimator of the discretization
error $\eta_h$; estimator of the model error $\eta_m$; their sum $\eta$; the true
error $j(u - u_{hm})$; the effectivity index $I_{\mathrm{eff}}$

#nodes   % of multi.   eta_h      eta_m      eta        j(u - u_hm)   I_eff
   585       0         2.168      1.043      3.210      2.031         1.58
  1047      21.1       1.250      9.953e-2   1.350      5.385e-1      2.51
 2 085      37.4       1.584e-1   7.729e-2   2.356e-1   1.378e-1      1.71
 4 871      48.9       7.488e-2   3.830e-2   1.132e-1   5.351e-2      2.12
12 421      52.4       5.605e-2   1.602e-2   7.206e-2   3.186e-2      2.26
30 013      66.3       5.029e-2   9.372e-3   5.966e-2   2.757e-2      2.16
81 021      79.4       2.160e-2   6.017e-3   2.761e-2   1.065e-2      2.59




Fig. 2. Ozone flame: estimator $\eta_m$ for the modeling error; estimator
$\eta_h$ for the discretization error; their sum $\eta$; true error
$j(u - u_h)$ as a function of the number of nodes (mesh points)



However, the error estimator is reliable since it provides an upper bound for 
the actual error. 
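The relation between the columns of Table 1 can be checked with a few lines of code. The sketch below only re-uses the numbers printed in the table; the names `TABLE_1`, `effectivity_index` and `check_table` are illustrative and not part of the original work.

```python
# Rows of Table 1: (nodes, eta_h, eta_m, eta, true_error, listed_index)
TABLE_1 = [
    (585, 2.168, 1.043, 3.210, 2.031, 1.58),
    (1047, 1.250, 9.953e-2, 1.350, 5.385e-1, 2.51),
    (2085, 1.584e-1, 7.729e-2, 2.356e-1, 1.378e-1, 1.71),
    (4871, 7.488e-2, 3.830e-2, 1.132e-1, 5.351e-2, 2.12),
    (12421, 5.605e-2, 1.602e-2, 7.206e-2, 3.186e-2, 2.26),
    (30013, 5.029e-2, 9.372e-3, 5.966e-2, 2.757e-2, 2.16),
    (81021, 2.160e-2, 6.017e-3, 2.761e-2, 1.065e-2, 2.59),
]


def effectivity_index(eta, true_error):
    """Ratio of estimated to true error; 1 would mean an exact estimator."""
    return eta / true_error


def check_table(rows, rel_tol=1e-3):
    """For each row, test eta_h + eta_m = eta (up to rounding) and recompute I_eff."""
    results = []
    for nodes, eta_h, eta_m, eta, err, _listed in rows:
        sum_ok = abs((eta_h + eta_m) - eta) <= rel_tol * eta
        results.append((nodes, sum_ok, round(effectivity_index(eta, err), 2)))
    return results


RESULTS = check_table(TABLE_1)
```

The recomputed indices can be compared with the last column of the table; values above 1 correspond to the overestimation discussed in the text.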

The adaptive algorithm balances both types of errors by adapting the mesh size
and the model. Figure 2 illustrates how the two sources of error are
equilibrated. We plot the estimators $\eta_h$ and $\eta_m$, their sum $\eta$,
and the true error $j(u - u_{hm})$ as a function of the number of mesh nodes.
The estimator and the true error clearly show the same asymptotic behavior.

The sequence of locally refined meshes with 1047, 2 085, and 4 871 nodes
is shown in Figure 3. The darker areas indicate the part of the computational
domain where the multicomponent diffusion model is used. In the remaining
(light) part, the simple Fick law is used. We observe that the estimator
detects quite well the reaction area where the difference between the two
diffusion models influences the accuracy of the output functional $j(u)$.

Fig. 3. On the upper half of each picture, the areas where multicomponent diffusion
is used (dark/red areas) and where Fick's law is used (light areas) are indicated; the
lower half shows the corresponding locally refined mesh
Summary. In this paper we derive a strengthened Cauchy-Schwarz inequality that
enables us to formulate a short and transparent proof of the coercivity of a Least
Squares Mixed Finite Element bilinear form. Also, it shows that the coupling between
$H_0^1(\Omega)$ and $H(\mathrm{div};\Omega)$ is weak enough to be neglected. This
results in an alternative way to compute approximations of both the scalar variable
and its gradient for second order elliptic problems.



1 Least squares mixed finite elements 

We consider the application of the least-squares mixed finite element method
to the following second order elliptic problem. Let $\Omega$ be a bounded domain
of arbitrary dimension with Lipschitz continuous boundary. Given
$f \in H^{-1}(\Omega)$, find $u \in H_0^1(\Omega)$ such that

\[ -\mathrm{div}\,(A\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad (1) \]

where $A$ is uniformly symmetric positive definite with Lipschitz continuous
coefficients and all eigenvalues in the interval $[\beta^2, \beta^{-2}]$ for some
$\beta \in (0,1]$. The least-squares mixed finite element method first writes the
second order differential equation as a system of two first order equations,

\[ p = -A\nabla u \ \text{in } \Omega, \qquad \mathrm{div}\,p = f \ \text{in } \Omega. \qquad (2) \]

Denoting the $L^2(\Omega)$ inner product and norm by $(\cdot,\cdot)_0$ and
$|\cdot|_0$, the following quadratic functional
$J : H_0^1(\Omega) \times H(\mathrm{div};\Omega) \to \mathbb{R}$,

\[ J(v,q) = |f - \mathrm{div}\,q|_0^2 + (q + A\nabla v, A^{-1}(q + A\nabla v))_0, \qquad (3) \]

is minimized over suitable subspaces
$V_h \times \Gamma_h \subset H_0^1(\Omega) \times H(\mathrm{div};\Omega)$. Setting
the first variation in (3) to zero leads to the following discrete problem to solve,



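\[ \forall (v_h,q_h) \in V_h \times \Gamma_h, \quad B(u_h,p_h;v_h,q_h) = (f, \mathrm{div}\,q_h)_0, \qquad (4) \]

where the bilinear form
$B : [H_0^1(\Omega) \times H(\mathrm{div};\Omega)] \times [H_0^1(\Omega) \times H(\mathrm{div};\Omega)] \to \mathbb{R}$
is defined by

\[ B(w,r;v,q) = (\mathrm{div}\,r, \mathrm{div}\,q)_0 + (r + A\nabla w, A^{-1}(q + A\nabla v))_0. \qquad (5) \]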



It was proved in [4] that $B$ is continuous and coercive. The proof of coercivity
was in our opinion somewhat tedious; coercivity may also be derived as a simple
corollary of the following lemma. This lemma, which may be called a strengthened
Cauchy-Schwarz inequality, may also prove useful in a different context.

Firstly, define norms on $H(\mathrm{div};\Omega)$ and $H_0^1(\Omega)$ and on their
product space by

\[ \|q\|_{\mathrm{div},A}^2 = (A^{-1}q,q)_0 + |\mathrm{div}\,q|_0^2, \qquad |v|_{1,A}^2 = (A\nabla v, \nabla v)_0, \qquad (6) \]

\[ \|(v,q)\|_{1\times\mathrm{div},A}^2 = |v|_{1,A}^2 + \|q\|_{\mathrm{div},A}^2. \qquad (7) \]

If $A = I$, they reduce to the usual norms on those spaces. They are equivalent
to them in case $A \ne I$. In particular, we will explicitly use that for all
$v \in H_0^1(\Omega)$,

\[ \beta\, |v|_1 \le |v|_{1,A}. \qquad (8) \]

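Lemma 1. For all $q \in H(\mathrm{div};\Omega)$ and $v \in H_0^1(\Omega)$ we have

\[ |(q,\nabla v)_0| \le \gamma\, |v|_{1,A}\, \|q\|_{\mathrm{div},A}, \qquad (9) \]

where

\[ 0 < \gamma = \frac{d}{\sqrt{d^2 + \beta^2}} < 1 \quad \text{with} \quad 0 < d = \sup_{0 \ne v \in H_0^1(\Omega)} \frac{|v|_0}{|v|_1}. \qquad (10) \]

Proof. Let $q \in H(\mathrm{div};\Omega)$ and $v \in H_0^1(\Omega)$ be given. Then

\[ |(q,\nabla v)_0| = |(A^{-1/2}q, A^{1/2}\nabla v)_0| \le |v|_{1,A} \sqrt{(A^{-1}q,q)_0}. \qquad (11) \]

By definition of $d$ and in view of (8), we obtain by using Green's formula that

\[ |(q,\nabla v)_0| = |(\mathrm{div}\,q, v)_0| \le |v|_0\, |\mathrm{div}\,q|_0 \le \frac{d}{\beta}\, |v|_{1,A}\, |\mathrm{div}\,q|_0. \qquad (12) \]

Multiply (11) by $d/\beta$, square the result, and add it to the square of (12).
This gives

\[ \left(1 + \frac{d^2}{\beta^2}\right) |(q,\nabla v)_0|^2 \le \frac{d^2}{\beta^2}\, |v|_{1,A}^2 \left( (A^{-1}q,q)_0 + |\mathrm{div}\,q|_0^2 \right) = \frac{d^2}{\beta^2}\, |v|_{1,A}^2\, \|q\|_{\mathrm{div},A}^2. \qquad (13) \]

This proves the statement. $\Box$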
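The inequality (9) can be tried out numerically in a simple setting. The following sketch assumes the one-dimensional case $\Omega = (0,1)$ with $A = I$, so that $\beta = 1$ and $d = 1/\pi$ (the Poincaré constant of $H_0^1(0,1)$); the test functions and the trapezoidal quadrature are choices made here for illustration and do not come from the paper.

```python
import math


def integrate(f, n=20000):
    """Composite trapezoidal rule on (0, 1) with n subintervals."""
    h = 1.0 / n
    total = 0.5 * (f(0.0) + f(1.0))
    for i in range(1, n):
        total += f(i * h)
    return total * h


# Assumed model setting: Omega = (0, 1), A = I, hence beta = 1 and d = 1/pi.
BETA = 1.0
D_CONST = 1.0 / math.pi
GAMMA = D_CONST / math.sqrt(D_CONST ** 2 + BETA ** 2)


def both_sides(v_prime, q, q_prime):
    """Return (|(q, v')_0|, gamma * |v|_{1,A} * ||q||_{div,A}) for A = I."""
    lhs = abs(integrate(lambda x: q(x) * v_prime(x)))
    v_semi = math.sqrt(integrate(lambda x: v_prime(x) ** 2))
    q_norm = math.sqrt(integrate(lambda x: q(x) ** 2 + q_prime(x) ** 2))
    return lhs, GAMMA * v_semi * q_norm


def v_prime(x):
    """Derivative of v(x) = sin(pi x), which vanishes at x = 0 and x = 1."""
    return math.pi * math.cos(math.pi * x)


# Each entry: (q, q') with q' playing the role of div q in one dimension.
CASES = {
    "q = cos(pi x)": (lambda x: math.cos(math.pi * x),
                      lambda x: -math.pi * math.sin(math.pi * x)),
    "q = x^2": (lambda x: x * x, lambda x: 2.0 * x),
    "q = 1": (lambda x: 1.0, lambda x: 0.0),
}

RESULTS = {name: both_sides(v_prime, q, dq) for name, (q, dq) in CASES.items()}
```

In this setting the pair $v = \sin(\pi x)$, $q = \cos(\pi x)$ gives equality in (9) up to rounding, which suggests that the constant $\gamma$ cannot be improved there; the other two choices of $q$ satisfy the inequality strictly.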

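Remark 1. Instead of using $|v|_0 \le d|v|_1 \le (d/\beta)|v|_{1,A}$ as in (12),
we might also have chosen to define just one single constant $\eta$, giving rise
to a possibly smaller constant $\gamma$ in (9) as follows,

\[ 0 < \gamma = \frac{\eta}{\sqrt{\eta^2 + 1}} \quad \text{with} \quad 0 < \eta = \sup_{0 \ne v \in H_0^1(\Omega)} \frac{|v|_0}{|v|_{1,A}}. \qquad (14) \]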
Remark 2. If we define $\|\cdot\|_{1,A}^2 = |\cdot|_0^2 + |\cdot|_{1,A}^2$, then
following the same lines as in the proof of Lemma 1 we get

|(q,Vv)o| < ^||t^||i,A||q||div,A- (15) 

This seems a more symmetric and in a sense even stronger result. However,
the norm $\|\cdot\|_{1,A}$ is too strong for the applications to be discussed below.

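Corollary 1. With $\gamma$ as in (10), for all non-zero
$(v,q) \in H_0^1(\Omega) \times H(\mathrm{div};\Omega)$ we have

\[ B(v,q;v,q) \ge (1 - \gamma)\, \|(v,q)\|_{1\times\mathrm{div},A}^2 > 0. \qquad (16) \]

Proof. By definition of $B$ we find

\[ B(v,q;v,q) = \|(v,q)\|_{1\times\mathrm{div},A}^2 + 2(q,\nabla v)_0 \ge \|(v,q)\|_{1\times\mathrm{div},A}^2 - 2|(q,\nabla v)_0|. \qquad (17) \]

Lemma 1 and the inequality $2|ab| \le a^2 + b^2$ give that

\[ 2|(q,\nabla v)_0| \le 2\gamma\, |v|_{1,A}\, \|q\|_{\mathrm{div},A} \le \gamma\, \|(v,q)\|_{1\times\mathrm{div},A}^2. \qquad (18) \]

Since $\gamma < 1$, the statement follows. $\Box$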

Remark 3. Using Lemma 1, it can also easily be shown that $B$ is continuous
in the sense that

\[ |B(w,r;v,q)| \le (1 + \gamma)\, \|(w,r)\|_{1\times\mathrm{div},A}\, \|(v,q)\|_{1\times\mathrm{div},A}, \]

which, in view of the constant for the coercivity in Corollary 1, is an appealing
result.

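The continuity and coercivity of $B$ imply, by the Lax-Milgram lemma, the
existence of a unique pair $(u_h,p_h) \in V_h \times \Gamma_h$ of approximations
of $(u,p)$. Céa's lemma now implies quasi-optimality in the sense that for all
$(v_h,q_h) \in V_h \times \Gamma_h \subset H_0^1(\Omega) \times H(\mathrm{div};\Omega)$,

\[ \|(u - u_h, p - p_h)\|_{1\times\mathrm{div},A} \le \frac{1+\gamma}{1-\gamma}\, \|(u - v_h, p - q_h)\|_{1\times\mathrm{div},A}. \qquad (19) \]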

From this perspective, it may seem unwise to employ spaces $V_h$ and $\Gamma_h$
which have different approximation qualities, because the approximation
quality of the product space is not better than that of the worse of the two.
However, it can be shown that under rather weak conditions, there is still a
certain amount of independence between the two approximations. In terms of the
linear algebraic problem that arises from (4), this means that the two diagonal
blocks of the system matrix, which correspond to the interactions of $V_h$ with
itself and of $\Gamma_h$ with itself, are only weakly coupled by the two
off-diagonal blocks, which represent the interactions between $V_h$ and
$\Gamma_h$. In fact, this coupling is so weak that solving just two well-chosen
linear systems with the diagonal blocks only leads to approximations $u_h^0$
and $p_h^0$ for which

\[ \|(u_h - u_h^0, p_h - p_h^0)\|_{1\times\mathrm{div},A} \le Ch\, \|(u - u_h, p - p_h)\|_{1\times\mathrm{div},A}. \]

This defines an approximation method for $(u,p)$ in $V_h \times \Gamma_h$ that
gives approximations that are superclose to the least-squares mixed
approximations $(u_h,p_h)$, but which is computationally much more efficient.

2 Solving the system of linear equations 

Let be a basis for 14 and (qj)^i a basis for Define matrices 

S = (sij), C = {cij)^ D = {dij) and G = {gij) by means of their entries 

Sij = {AVi.Vvj), Cij = ('i;i,divq^-), dij (A“^qi,qj), gij ^ (div q^ , div q^ ) . 

Let /n C have entries (/, divq^), and /m C R^ entries (/, fj). Define 
Pat € R^ and um,Um C R-^ as the solutions of the systems 

( C* S') (um) Su%j = fM- (21) 

Let el, be the row of the k x k identity matrix, then 

N M 

Ph = X^(e]vPw)qj- e and uh = '^{eMUM)vj € Vh (22) 

3 = 1 3 = 1 

are the solution of (4), whereas 

M 
3 = ^ 

yields the standard finite element approximation of u in Vh. 

Now, given an approximation $u_M^0$ of $u_M$, we may substitute it in the first
(block)-equation of (21) to obtain an approximation $p_N^0$ of $p_N$ from

\[ (D + G)\, p_N^0 = f_N - C u_M^0. \qquad (23) \]

Write $u_h^0$ and $p_h^0$ for the finite element functions corresponding to the
vectors $u_M^0$ and $p_N^0$, respectively. Then the counterpart of (23) in terms
of the finite element spaces is to write out (4) and substitute $u_h^0$ for
$u_h$, resulting in

\[ (A^{-1}p_h^0, q_h) + (\mathrm{div}\,p_h^0, \mathrm{div}\,q_h) = (f, \mathrm{div}\,q_h) + (u_h^0, \mathrm{div}\,q_h). \qquad (24) \]



Notice that

\[ (A^{-1}q_h, q_h) + (\mathrm{div}\,q_h, \mathrm{div}\,q_h) = \|q_h\|_{\mathrm{div},A}^2, \qquad (25) \]

hence $D + G$ is invertible and the approximation $p_N^0$ is well-defined. It can
be used to compute a hopefully better approximation $u_M^1$ of $u_M$ by
substituting it into the second (block)-equation in (21) and solving
$S u_M^1 = -C^* p_N^0$. If we continue like this, we get the Block Gauss-Seidel
iteration for the linear system (21) as follows.



given iterate 



1 



(26) 



If we write uj^ and p;^ for the finite element functions that correspond to the 
vectors and p^, respectively, this iteration reads as 



given iterate 



(^ + (divp]^,divq/,) = (/,divq/,) -f «,divq/,), 

Vwft) = -(wh,divp^). 



(27) 



We will now study its convergence. 

Corollary 2. With 7 as in (10), the iterates {u^h-,pD defined in (27) satisfy 
'f~^\\uh - <■^7, A < IIP/v - P^lldiv.A < lluh - (28) 



Proof. Subtracting (27) from the least-squares mixed discrete equations (4) 
and testing with q_h — Ph — pi, '^h — Uh — results in 

\\Ph - Ph fdw,A = i'^h - < , div (ph - p^)) (29) 

and 

\uh - - <‘^\div(p;> - p^)). (30) 

Applying Green’s formula and Lemma 1 gives the statement. 0 

This proves convergence of the Block Gauss-Seidel iteration for the system (21)
with a convergence factor that is independent of the dimension of the subspace.
In each step of the Block Gauss-Seidel iteration, a system with $S$ and one with
$D + G$ has to be solved. Even though for many finite element spaces this can be
done in optimal computational complexity with multigrid solvers [1], there is
no need to do so more than once if the right start vector is chosen. Indeed,
choosing as start vector the standard finite element approximation of the
problem, defined via the linear system with matrix $S$ and right-hand side
$f_M$, and computing the corresponding $p_N^0$ from (23) is already sufficient
in most situations. This procedure costs only one solve with $S$ and one
with $D + G$.
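The structure of the iteration can be illustrated on a small synthetic example. In the sketch below the matrices `K` (standing in for $D + G$), `S`, `C` and the right-hand side `F_N` are made-up numbers chosen so that the block matrix is symmetric positive definite; they are not finite element matrices, and the right-hand side $(f_N, 0)$ of the block system is an assumption of this sketch.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def block_gauss_seidel(k, c, s, f_n, u0, steps):
    """Alternate K p = f_N - C u and S u = -C^T p, starting from u0."""
    ct = transpose(c)
    u = u0[:]
    p = None
    for _ in range(steps):
        cu = matvec(c, u)
        p = solve(k, [fi - ci for fi, ci in zip(f_n, cu)])
        u = solve(s, [-v for v in matvec(ct, p)])
    return p, u


# Synthetic data (assumed for illustration only).
K = [[4.0, 1.0], [1.0, 3.0]]      # stands in for D + G
S = [[2.0, 0.0], [0.0, 2.0]]
C = [[0.5, 0.2], [0.1, 0.4]]      # weak coupling block
F_N = [1.0, 2.0]

P_GS, U_GS = block_gauss_seidel(K, C, S, F_N, [0.0, 0.0], steps=30)

# Reference: direct solve of the full block system [[K, C], [C^T, S]].
CT = transpose(C)
FULL = [K[0] + C[0], K[1] + C[1], CT[0] + S[0], CT[1] + S[1]]
DIRECT = solve(FULL, F_N + [0.0, 0.0])
```

Because the coupling block is small compared with the diagonal blocks, the iterates approach the solution of the full system after few steps, which mirrors the weak coupling discussed above; each step costs one solve with `K` and one with `S`.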



Theorem 1. Let = uf^ be the standard finite element approximation of 
(1) resulting from solving the linear system Su\^ = fu- Solve p^ from {D + 
G)p^ = /at — Guj^. Then, under the assumtion that Vh and Fh satisfy 
- $(\Gamma_h, \mathrm{div}\,\Gamma_h)$ is a Babuška-Brezzi stable mixed finite element pair,

— For all f G the solution u of (1) satisfies \\u \\2 < C\f\o, 

-Vve n 3vh e v^, \v - Vh\i < Ch\v\ 2 , 

- Vq S 3qh€Th, Iq - q„|o < C/i|q|i, 

- VqG 3q/i e T/,, jdivq - divq/i|o < Ch\q\ 2 , 

we have that 

II '^h') P/i) II lxdiv,A ^ O/lll (it Uh, P P/i) II lxdiv,A- 

Proof. The assumptions above imply that
$|u_h - u_h^0|_1 \le Ch\, \|(u - u_h, p - p_h)\|_{1\times\mathrm{div}}$, which is a
supercloseness result proved in [2]. By equivalence of the norms defined in terms
of $A$ with their counterparts defined by taking $A = I$, we may switch to
$A$-norms. Corollary 2 then shows that $\|p_h - p_h^0\|_{\mathrm{div},A}$ shares
the same upper bound. Hence, the statement follows. $\Box$



3 Conclusions 

The above shows that under rather weak conditions, $u_h^0, p_h^0$ are higher
order perturbations of the least-squares mixed finite element solutions
$u_h, p_h$. Since the computation of $u_h^0, p_h^0$ requires only one solve with
$S$ and one with $D + G$, they are much cheaper to compute than $u_h, p_h$,
whereas they cannot be distinguished from one another. Hence, it does not make
sense to apply the least-squares mixed method under these conditions.

In fact, instead of putting energy into solving for $p_h^0$, it is also possible
to construct yet another approximation for $p_h$ in $\Gamma_h$ by means of a
simple local post-processing of the standard finite element approximation. It is
however not clear when or if such a postprocessed approximation will also be a
higher order perturbation of $p_h$, as $p_h^0$ is.
Summary. The limit analysis problem (LAP) for estimation of electric durability
for a dielectric in a powerful electric field is examined. The appropriate dual problem 
is formulated. After the standard piecewise linear continuous finite-element approxi- 
mation the dual LAP is transformed into the problem of mathematical programming 
with linear equality constraints. This finite dimension problem is effectively solved 
by the standard method of gradient projection. 



1 Introduction 

Investigation of the electrostatic boundary-value problems (BVPs) for di- 
electrics in a powerful electric field is of particular interest in both theory 
and practice. The current research is motivated by significance and practical 
interests in Electrical Engineering and Microelectronics. 

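The electric state of a medium in a given domain $\Omega \subset \mathbb{R}^3$ is
characterized by the bulk and surface density of charges and by the vectors of
electric field intensity $\mathbf{E} = \{E_i\} \in \mathbb{R}^3$, electric
induction $\mathbf{D} = \{D_i\} \in \mathbb{R}^3$, and electric current density
$\mathbf{J} = \{J_i\} \in \mathbb{R}^3$. The vector $\mathbf{D}$ is introduced by
the relation $\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}$, where
$\varepsilon_0 \approx 8.85 \cdot 10^{-12}$ is the electric permittivity of a
vacuum and $\mathbf{P} \in \mathbb{R}^3$ is the vector of polarization density
[11, 14, 16]. For the electric field intensity the electrostatic potential $u$ is
introduced such that $\mathbf{E}(\mathbf{x}) = -\nabla u(\mathbf{x})$ for almost
every $\mathbf{x} \in \Omega$, where $\nabla = \{\partial/\partial x_i\}$ is the
differential vector operator.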

In weak electric fields the conductivity current in a dielectric medium
is practically absent and the simplest linear constitutive relation
$\mathbf{E} \to \mathbf{D}$ is used [11, 14, 16]. As a result, various effective
analytical and numerical methods have been worked out for the solution of the
corresponding linear BVPs, for example, in [15].

The classic method of estimation of puncture conditions is based on a
point criterion. Namely, it is assumed that the electric puncture sets in when
$\max\{|\mathbf{E}(\mathbf{x})| : \mathbf{x} \in \Omega\} > E_0$, where
$\mathbf{E}$ is the solution of the linear electrostatic BVP and $E_0 > 0$ is the
critical value, which is measured in physical experiments on thin plates in a
homogeneous electric field [15, 16]. Unfortunately, for dielectrics with a
complex shape in nonhomogeneous electric fields this method introduces a large
error.

In powerful electric fields the essentially nonlinear phenomena of polarization
saturation ($|\mathbf{P}| \le P_* < +\infty$) and powerful growth of the electric
current ($\partial|\mathbf{J}|/\partial|\mathbf{E}| \gg \varepsilon_0$) must be
taken into account [9, 11]. As a result, a complementary physical parameter of
the dielectric medium $\Lambda > 0$ always exists such that
$|\mathbf{D} - \mathbf{J}| \le \Lambda < \infty$.

Within the framework of the variational method, the existence of the limit
electrostatic load (external charges for which the electrostatic BVP has no
solution) was pointed out in [6, 7]. From the physical point of view this effect
is treated as a loss of electrostatic balance, i.e. as the beginning of the
electric puncture of the dielectric. For calculation of the limit electrostatic
load the original variational limit analysis problem (LAP) was formulated. From
the mathematical point of view this problem needs a relaxation, because its
solution belongs to the space $BV(\Omega)$ of scalar functions of bounded
variation, whose generalized gradient is a bounded Radon measure [1, 13, 17].

Unfortunately no clear physical interpretation can be provided for the fully 
relaxed LAP. Therefore, the original partial relaxation of LAP was proposed 
in [6, 7]. This relaxation is based on the special discontinuous finite-element 
approximation (FEA), which was proposed earlier by the author for LAP in 
non-linear elasticity [4, 5]. But after relaxation the appropriate finite dimen- 
sional problem becomes ill-conditioned and thus needs special preconditioned 
numerical methods as, for example, presented by the author in [3]. 

In this paper the dual LAP in electrostatics is formulated. It has a clear 
physical interpretation and after the standard piecewise linear continuous FEA 
is transformed into a problem of mathematical programming with linear equal- 
ity constraints. This finite dimensional problem is effectively solved by the 
standard method of gradient projection, which is easily adapted for parallel 
computations. 
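To illustrate the kind of iteration meant by gradient projection under a linear equality constraint, the following sketch treats a deliberately simple stand-in problem: minimize $\frac{1}{2}\sum_i w_i x_i^2$ subject to a single constraint $a \cdot x = b$. The objective, the data `W`, `A_ROW`, `B` and the step size are assumptions chosen for illustration; the objective of the dual LAP itself is different and is not reproduced here.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def project_onto_nullspace(g, a):
    """Remove from g its component along a, so that a . result = 0."""
    scale = dot(a, g) / dot(a, a)
    return [gi - scale * ai for gi, ai in zip(g, a)]


def gradient_projection(grad, x0, a, step, iterations):
    """Projected gradient steps that keep a . x constant (x0 must be feasible)."""
    x = x0[:]
    for _ in range(iterations):
        direction = project_onto_nullspace(grad(x), a)
        x = [xi - step * di for xi, di in zip(x, direction)]
    return x


# Stand-in problem (assumed data): minimize 0.5 * sum_i w_i x_i^2  s.t.  a . x = b.
W = [1.0, 2.0, 4.0]
A_ROW = [1.0, 1.0, 1.0]
B = 1.0


def grad(x):
    return [wi * xi for wi, xi in zip(W, x)]


# Feasible start: the minimum-norm point on the constraint hyperplane.
X0 = [B * ai / dot(A_ROW, A_ROW) for ai in A_ROW]
X = gradient_projection(grad, X0, A_ROW, step=0.2, iterations=500)

# Closed-form solution from the Lagrange conditions w_i x_i = lambda * a_i.
LAM = B / sum(ai * ai / wi for ai, wi in zip(A_ROW, W))
X_EXACT = [LAM * ai / wi for ai, wi in zip(A_ROW, W)]
```

Every update moves along the projected gradient, so the constraint stays satisfied throughout; since the projection and the update act componentwise apart from two inner products, the loop is straightforward to distribute over several processes.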

The numerical results show that the proposed limit analysis method has 
a qualitative advantage over the classic technique of estimation of puncture 
conditions. This method can be used in Electrical Engineering and Microelec- 
tronics. 



2 LAP in Electrostatics 

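Let a dielectric medium occupy a domain $\Omega \subset \mathbb{R}^3$. The
polarization and ionization properties of the dielectric medium are described by
two constitutive relations
$\mathbf{D} = \mathbf{D}(\mathbf{x},\mathbf{E}) : \Omega \times \mathbb{R}^3 \to \mathbb{R}^3$
and
$\mathbf{J} = \mathbf{J}(\mathbf{x},\mathbf{E}) : \Omega \times \mathbb{R}^3 \to \mathbb{R}^3$,
as shown, for example, in [11, 14, 16]. In practice the relations
$D_i = \varepsilon_{ij}(\mathbf{x},\mathbf{E}) E_j$ and
$J_i = \sigma_{ij}(\mathbf{x},\mathbf{E}) E_j$ are used, where
$\{\varepsilon_{ij}\}$ and $\{\sigma_{ij}\}$ are the symmetric second-order
tensors of dielectric permittivity and conductivity, respectively. Here and in
what follows the summation rule applies over repeated indices. For an isotropic
medium $\varepsilon_{ij} = \varepsilon\delta_{ij}$ and
$\sigma_{ij} = \sigma\delta_{ij}$, where
$\varepsilon = \varepsilon(\mathbf{x},|\mathbf{E}|)$ and
$\sigma = \sigma(\mathbf{x},|\mathbf{E}|)$ are scalar functions and $\delta_{ij}$
is the Kronecker symbol. For a homogeneous medium
$\{\varepsilon_{ij}\}, \{\sigma_{ij}\} = \mathrm{const}(\mathbf{x})$.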

Let the following quasi-static electric influences act on the dielectric: a bulk 
charge with density p in i?, a surface charge with density ^ on a portion 
of the boundary, and a portion of the boundary is grounded, i.e. u = 0 on
. Here U n = 0 and |jT^| > 0. Point charges are absent. 

In electrostatics it is assumed that an external source of the electric field 
compensates the work of the electric current in the dielectric. In accordance 
with the classical Thomson principle the true electrostatic potential is the 
solution of the following variational problem: 



— arg inf { n{u) — A(u) : u ^ 



( 1 ) 




^(x, Vu(x)) dQ, 



A{u) = J gu dQ -f J 

Q n 



gu dy. 



where V — {u : Q ^ R; u(x.) =0, x G is the set of admissible electro- 
static potentials, ^ is the specific and n{u) is the full potential energy of the 
electric field, A{u) is the work of an external source on a transference of charges 
from infinity to i7. 

In compliance with the Thompson and Joule-Lenz laws the function 
^(x, E) is calculated as 



1 

<?(x,E)=Ei j [a(x,pE) - Ji(x,pE)j dp. 
0 



In Fig. 1 experimental constitutive relations $|\mathbf{E}| \to |\mathbf{D}|$
(line 1) and $|\mathbf{E}| \to |\mathbf{J}|$ (line 2) for real isotropic
dielectric media are presented [9, 11, 16]. The corresponding function of
effective induction is shown by line 3. It is easily seen that for every
dielectric medium a scalar $\Lambda > 0$ always exists such that for every
$\mathbf{E} \in \mathbb{R}^3$ and almost every $\mathbf{x} \in \Omega$ the
following estimate holds:



^(x,E)<A(x)|E|, 



( 2 ) 



where 



A(x)=max| D(x, E) — J(x, E) 



E e I . 



From the physical point of view $\Lambda$ is the electric saturation. In what
follows, we consider a homogeneous dielectric medium for which
$\Lambda = \mathrm{const}(\mathbf{x})$.

From the estimation (2) it follows [12, 17] that the set of admissible elec- 
trostatic potentials is defined as the following subspace 



V = {ue : u(x) = 0, X G r^} . (3) 



From the mathematical point of view the variational problem (1) can have
no solution because the functional $\Pi(u) - A(u)$ can be unbounded from below
on the set $V$. In particular, after the point $E_0 = |\mathbf{E}_*|$ (see
Fig. 1) the full potential energy of the electric field $\Pi(u)$ grows less than
linearly in $\|u\|_{1,1}$, whereas the work of the electric field on the external
charges $A(u)$ always grows linearly in $\|u\|_{1,1}$. As a result, for an
admissible minimizing sequence $\{u_m\} \subset V$ with
$\|u_m\|_{1,1} \to \infty$ we have $\Pi(u_m) - A(u_m) \to -\infty$ as
$m \to \infty$ (for details see [7]), i.e. the electrostatic variational problem
(1) is not well-posed. Namely, the limit electrostatic load exists, i.e.
external charges $(\rho, g)$ for which problem (1) has no solution. From the
physical point of view this effect can be treated as the beginning of the
electric puncture of the dielectric, because it corresponds to a loss of the
electrostatic balance between an external source of charges and the dielectric
medium. Here we have a full analogy with some problems of global stability and
fracture in Mechanics of Solids [2, 4, 5, 17].

Limit Analysis Method in Electrostatics 179

Fig. 1. Experimental (lines 1, 2) and effective (line 3) constitutive relations

For a definition of the limit electrostatic load we introduce the set of ad- 
missible external charges for which the minimized functional in problem (1) is 
bounded from below on V and, therefore, a solution of this problem exists: 

B - { (g,g) e L^{Q)xL^{r‘^) : mi{n{u)-A{u) : txG F) > -oo }. 

The set is non-empty because for small external charges the problem (1) is 
transformed into the classic variational problem of linear electrostatics, which 
always has a solution [9]. 

For arbitrary external charges (^o,Po) ^ B we examine the sequence of 
charges, which are proportional to the real parameter t > 0. 

Definition 1 . The number >0 is defined as the limit parameter of electro- 
static loading and the limit electrostatic load, if {t go, t go) G B 

for 0 <t <t^ and {tgo, tgo) ^ B for t > t^. 

As a result, the analysis of the electrostatic balance in a dielectric reduces
to the investigation of the set of positive parameters $t$ for which the
one-parameter functional

\[ I_t(u) = \Pi(u) - t\, A(u) \]

is bounded from below on the set of admissible electrostatic potentials $V$ from
(3). The following basic result has been proven recently by the author in [6, 7].

Theorem 1. The finite limit parameter of electrostatic loading $t_*$ exists. It is
calculated as the solution of the following limit analysis problem:

\[ t_* = \Lambda \inf \left\{ \int_\Omega |\nabla u(\mathbf{x})|\, d\Omega : u \in V,\ A(u) = 1 \right\}, \qquad (4) \]

where $\Lambda$ is the electric saturation from (2).

From Definition 1 it follows that for $t_* < 1$ the electrostatic variational
problem (1) has no solution. This phenomenon corresponds to the beginning of
the electric puncture of the dielectric medium. Therefore, the limit analysis
problem (4) is the main problem for the estimation of puncture conditions for
dielectrics of complex shape in powerful nonhomogeneous electric fields, which
addresses one of the modern fundamental problems [11, 14, 16].

It was pointed out in [6, 7] that the solution of LAP (4) belongs to the
space of scalar functions of bounded variation
$BV(\Omega) \supset W^{1,1}(\Omega)$. This space consists of functions
$u \in L^1(\Omega)$ having bounded total variation



J \du\ = sup ^ J (V-D)^dl? : D G |D| < 1 



where $du$ is the vector Radon measure denoting the gradient of the function $u$
in the sense of distribution theory [1, 13, 17]. In this case
$\int_\Omega |du| = \int_\Omega |\nabla u(\mathbf{x})|\, d\Omega$ for every
$u \in W^{1,1}(\Omega)$. Here and in what follows the dot between vectors
denotes their Euclidean scalar product.

The Banach space $BV(\Omega)$ is weak* sequentially compact with the norm
$\|u\|_{BV} = \|u\|_1 + \int_\Omega |du|$; therefore, the mathematically correct
and fully relaxed LAP has the following form:



= A inf J \du\ : u G BV{f2)^ = 0, A{u) == 1 j . 

Unfortunately, at present this problem has no clear physical interpretation.

Within the framework of the classic approach the original partial relaxation 
of LAP can be used [6, 7]. This relaxation is based on the special discontinuous 
FEA [4, 5]. But after relaxation the appropriate finite dimensional problem be- 
comes ill-conditioned and thus needs special preconditioned numerical methods 
as, for example, presented by the author in [3]. 
3 Dual LAP

We construct here the dual LAP for a homogeneous dielectric using methods 
from duality theory [10]. Thus, we introduce the set of admissible fields of the 
effective induction compensating for the electric field of external charges in the 
weak sense [12], 




where n is the unit normal vector on the boundary and V • D is the 
distribution. 

The dual LAP is formulated as the problem of finding an admissible effec- 
tive induction of minimal intensiveness 

r*=inf{||D|U: D€G}, (6) 

where ||D||^ == inf{ sup(|D(x)| : x G i? \ cj) : |ct;| = 0 }. 

Theorem 2. For solutions of the problems (4) and (6) the relation t^r^ — A 
is true. 

Proof. It is easily verified that dual LAP (6) is equivalent to the problem 

= sup { 1 / > 0 : D € G, ||z^D||^ < 1 } . (7) 

For every value u > 0 and field F> G G the following equality is true: 

(n • D — g)u d'j — / ( V • D -h g)u df2 : u eV 

After integration by parts and taking into account the boundary condition on 
we find 



1 / = 1 / 1 / inf 



/ 



v = mi 




uT) ■ VudQ : u £V, A{u) = 




( 8 ) 



We introduce the bilinear functional L(D,ii) = /D • VudQ on 

n 

R^)x [12], then from (8) it follows that the problem (7) has 
the form 
T* ^ = sup { inf( L(D, u) : u£V, A{u) = 1 ) : D e G, ||D||^ < 1 } . 

For the bilinear functional $L(D,u)$ the classic equality

\[ \sup_{D} \inf_{u} L(D,u) = \inf_{u} \sup_{D} L(D,u) \]

is fulfilled [10], therefore,

\[ \tau_*^{-1} = \inf\{\, \sup( L(D,u) : D \in G,\ \|D\|_\infty \le 1 ) : u \in V,\ A(u) = 1 \,\}. \]

For all vectors $E, D \in \mathbb{R}^3$ the relation $|E| = \sup\{E \cdot D : |D| \le 1\}$ is
true. As a result, for every $u \in V$ we have the equality completing the proof:

\[ \sup\{\, L(D,u) : D \in G,\ \|D\|_\infty \le 1 \,\} = \int_\Omega |\nabla u(x)| \, d\Omega. \]

We showed that the estimation of the electric durability of a homogeneous 
dielectric is equivalent to finding an effective induction of minimal intensive- 
ness, which compensates the electric field of external charges. As a consequence 
of Theorem 2, if $\tau_* > \lambda$ then the electrostatic variational problem (1) has
no solution. The problem (6) is fully correct because the admissible effective 
induction has three independent components satisfying only one differential 
equation, i.e. a minimization in two independent components is possible. 



4 Finite-element approximation of dual LAP 

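By the standard FEA for the domain $\Omega \subset \mathbb{R}^n$ ($n = 1, 2, 3$) the sets $\Omega_h = \cup T_h$
and $\Gamma_h = \partial\Omega_h$ are constructed such that $|\Omega \setminus \Omega_h| \to 0$ and $|\Gamma \setminus \Gamma_h| \to 0$ for
$h \to +0$, where $h$ is the characteristic step of approximation and $T_h$ is a simplex
[8]. Every FEA is described by the set of nodes $\{x^k\}$ ($k = 1, 2, \ldots, m$).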

For the admissible effective induction the piecewise linear continuous
approximation is used [8]:

\[ D_h(x) = D^k \Psi_k(x) \quad (k = 1, 2, \ldots, m), \]

where $D^k = \{D_i^k\} \in \mathbb{R}^3$ is the admissible effective induction in the node $x^k$,
$\Psi_k : \Omega_h \to \mathbb{R}$ is a scalar function, continuous and linear on every simplex, such
that $\Psi_k(x^r) = \delta_{kr}$ ($k, r = 1, 2, \ldots, m$). The $\operatorname{supp}(\Psi_k)$ consists of simplices
having $x^k$ as a common node.

After the standard piecewise linear continuous FEA of external charges
$(\rho_h, g_h)$ and normal vector $n_h$ on the boundary $\Gamma_h$ the set of admissible
effective inductions (5) is approximated by the set

\[ G_h = \{\, D^k \in \mathbb{R}^3 : D^k \cdot \nabla\Psi_k(x) + \rho_h(x) = 0,\ x \in \Omega_h;\ \ n_h(x^r) \cdot D^k \Psi_k(x^r) = g_h(x^r),\ x^r \in \Gamma_h \,\}, \]

which is the convex set with piecewise linear boundaries, i.e. it is a simplex in
the space of global variables $\mathbb{R}^{3m}$.
As a result, the dual LAP (6) is approximated by the problem of mathe- 
matical programming with linear equality constraints 

\[ \tau_h = \min\{\, \max( |D^k| : k = 1, 2, \ldots, m ) : D^k \in G_h \,\}. \tag{9} \]

If the number of finite elements equals $m_1$ and the number of nodes on
$\Gamma_h$ equals $m_2$ then the number of free variables in the problem (9) equals
$3m - (m_1 + m_2)$. It is easily verified that the minimum number of variables
equals $2n + 1$ ($n = 1, 2, 3$), which is reached for the domain coinciding with
the simplest $n$-dimensional simplex because in this case $m = n + 1$, $m_1 = 1$,
$m_2 = n + 1$.

The objective functional in the problem (9) is a combination of convex
hypercones in the space $\mathbb{R}^{3m}$. Therefore, due to the linearity of the constraints in
the set $G_h$ this finite dimensional problem is effectively solved by the standard
method of gradient projection, which is easily adapted for parallel
computations.



5 Numerical results 



In the numerical experiment the following electrostatic BVP was considered: 
an isotropic and homogeneous dielectric has the form of a finite cylindrical rod
with the radius of section $a$ and length $2l$. The small round blocks of dielectric
are covered by conductors having charges ±Q. 

In view of the axial symmetry, the initial LAP (4) has the following form: 



t^ = X : u eV, u{r^ 1) = 1}, 



I{u) 



1 1 

•JI 

0 0 




1/2 

rdrdz, 



V=i^ue l)x(0, 1)) : «(r,0) = 0, ^(r,0) = 0, ^{0, ^) = o| , 

where $\lambda$ is the electric saturation from (2) (see Fig. 1), $\eta = l/a$ is the geometric
parameter and $r \in [0, 1]$, $\varphi \in [0, 2\pi)$, $z \in [0, 1]$ are the reduced cylindrical
coordinates [4, 6]. Here $\Omega = (0, 1) \times (0, 1)$ and $\Gamma = \{r \in [0, 1],\ z = 1\}$. It is easily
verified that puncturing charge Q* = '^cilt^. 

From Theorem 2 we have $t_* = \lambda/\tau_*$, where the parameter $\tau_*$ is the solution
of the appropriate dual LAP (6) on the following set of admissible effective
inductions:



G= {D G L°°(I2,R^) : V-D = 0 in i7, = 1 on = 0 in Q] . 
In the computer experiment a uniform $N \times N$ triangulation of the domain
$\Omega$ was used. As a result, the problem of mathematical programming (9) was
solved for $3(N+1)^2$ variables satisfying $2N^2 + N + 1$ linear equality constraints.
The experimental results for $\lambda = 1$ are shown in Table 1. It is easily seen that
$\tau_h \searrow 1$ as $h \to +0$, which fully coincides with the results presented in [6, 7].
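
The problem sizes quoted here follow from the count $3m - (m_1 + m_2)$ of Section 4. The short Python sketch below reproduces this bookkeeping. It rests on an interpretation that the text does not spell out: a uniform $N \times N$ triangulation is assumed to have $(N+1)^2$ nodes, $2N^2$ triangles, and $N+1$ boundary nodes carrying a constraint. The function names are illustrative only.

```python
def free_variables(m, m1, m2):
    """Free variables in problem (9): three components per node minus the constraints."""
    return 3 * m - (m1 + m2)


def uniform_grid_sizes(N):
    """Return (variables, constraints, free variables) for an N x N grid.

    Assumption: each grid cell is split into two triangles, and only the
    N + 1 nodes on one side of the square carry a boundary constraint.
    """
    m = (N + 1) ** 2   # number of nodes
    m1 = 2 * N ** 2    # number of triangles (one constraint each)
    m2 = N + 1         # number of constrained boundary nodes
    return 3 * m, m1 + m2, free_variables(m, m1, m2)


if __name__ == "__main__":
    for N in (5, 10, 20, 40, 80):
        variables, constraints, free = uniform_grid_sizes(N)
        print(N, variables, constraints, free)
```

Under these assumptions the counts agree with $3(N+1)^2$ variables and $2N^2 + N + 1$ constraints, and a single $n$-dimensional simplex gives $2n + 1$ free variables.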



Table 1. Numerical results.

N          5      10     20     40     80
$\tau_h$   2.17   1.83   1.36   1.06   1.01

The above analytical and numerical results are new. They are of practi- 
cal interest, but more theoretical and experimental research is desirable. For 
example, the presented limit analysis method can be used for the design of 
electric isolators. 
Summary. We consider the Navier-Stokes equations for incompressible flow in
axisymmetric tubes with abrupt changes of diameter. This paper is based on results for
the asymptotics of the solution in the vicinity of nonconvex internal angles where the
velocities possess an expansion $u(\rho, \vartheta) = \rho^{\gamma} \varphi(\vartheta) + \ldots$, where $\rho, \vartheta$ are local spherical
coordinates. Problems with corners, edges, etc. are often successfully solved by the
finite element method using an adaptive strategy usually based on a posteriori error
estimates. In our paper we suggest an alternative approach for the mesh refinement
near the corners, which makes use of the above expansion. It gives very precise
results in a cheap way. We give numerical results and show the pros and cons of this
approach.



1 Introduction 

We consider the Navier-Stokes equations for incompressible fluid flow in ax- 
isymmetric tubes with abrupt changes of diameter. At present, problems with 
corners, edges, etc. are often successfully solved by the finite element method 
using an adaptive strategy based on a posteriori error estimates. Concerning 
linear elliptic equations let us mention the work of I. Babuska and W. C.
Rheinboldt [2], concerning the Stokes problem the paper of M. Ainsworth and J. T.
Oden [1]. Other references can be found in [5]. In [5] we derived an a posteriori
error estimate for the Stokes problem in a 2D polygonal domain, and in [6]
also for the 3D case.

In this paper we apply an alternative approach for the mesh refinement 
near the corners, which makes use of the knowledge of the local behaviour of 
the solution near the corners. 

For the stationary Navier-Stokes equations we proved in [4] that for
nonconvex internal angles the velocities near the corners possess an expansion
$u(\rho, \vartheta) = \rho^{\gamma} \varphi(\vartheta) + \ldots$ (smoother terms), where $\rho, \vartheta$ are local spherical
coordinates. E.g. for the angle $\alpha = \frac{3}{2}\pi$ we have $\gamma = 0.5444837$. It is well-known
that using the standard finite element method on triangles with polynomials
of degree $p = 1, 2, 3$ we have the a priori error estimate

IK^i - Uh)\\H^O) < C 

which cannot be improved by increasing the degree of the approximating poly- 
nomials. 

In this paper the local behaviour of the solution near the singular point is
used to design, a priori, a mesh adjusted to the shape of the solution.

The first part of the paper is devoted to the behaviour of the singularity 
near the corner. The second part deals with the impact of the singularity on 
the refinement of the mesh. We show an example of the mesh with quadratic 
polynomials for velocity. Then we use this mesh for the numerical solution of 
flow in the channel with corners. 



2 Steady Navier-Stokes Equations Near the Corner 

The asymptotic behaviour of plane flow with corner singularities has been 
studied e.g. by Kondratiev [11], Ladeveze, Peyret [12]. 

In this section we deal with pipe flow (axially symmetric). To study the 
asymptotic behaviour of the solution of the Navier-Stokes equations for an 
incompressible fluid, we utilize the stream function - vorticity formulation, 
which in cylindrical geometry reads 



du; 



duj du u 

+U2- 
oz or r 



/ d“^u &^u 1 du 



\dz^ 



+ 



dr‘^ 



^ r dr 




d'^^ d'^'il^ 1 d'lp 

dz‘^ dr‘^ r dr ’ 



Ui 



1 dip 
r dr ’ 



U2 



1 dip 
r dz ’ 



( 1 ) 

(2) 

( 3 ) 



where $r, z$ are cylindrical coordinates, $u_1 = v_z$, $u_2 = v_r$ are velocity
components in the $z$, $r$ directions, respectively, $\omega$ is the vorticity, $\psi$ is the stream
function, and $\nu$ is the viscosity. We assume that all derivatives exist here at
least in the generalized sense.

In [4] we studied the stationary Stokes flow. There, substituting $\omega, u_1, u_2$
from (2) - (3) into (1) we got



1 av 

= 



' dz'^dr'^ dr"^ 



1 d^ip d^ip d^ip d^ip 3 d"^ip 
r 2 ^ dz'^dr dzdr‘^ dr^ dz'^ 



dz^ 



( 4 ) 



We are interested in the asymptotic behaviour of the solution near the 
corners. One example of our solution domain is shown in Figure 1, where the 
corners are the points P, Q. The lower edge of the domain coincides with the
axis of symmetry. 



Fig. 1. The solution domain $\Omega$



Investigating the equation (4) and using the technique of Kondratiev [11],
we showed in [4] that near the corners such as P or Q, the solution $\psi$ possesses
the expansion

\[ \psi(x,y) = \sum_j \sum_{s=0}^{p_j - 1} \rho^{\lambda_j} \ln^s\!\rho \; \psi_{sj}(\vartheta) + w(x,y), \tag{5} \]

where $\rho, \vartheta$ are polar coordinates, $w$ is smooth, and $\lambda_j$ are the poles of
multiplicity $p_j$ of the corresponding resolvent $R(\lambda)$. More specifically, for the internal
angle $\alpha = \frac{3}{2}\pi$, we proved that the leading term of the expansion for the velocity
components is as follows

\[ u_i(\rho, \vartheta) = \rho^{0.5444837} \varphi_i(\vartheta) + \ldots, \quad i = 1, 2. \tag{6} \]

Similar results have been proved for the Navier-Stokes equations.



3 Finite Element Solution to Steady Navier-Stokes 
Equations 

We are concerned with the finite element solution of the stationary Navier- 
Stokes equations. We would like to use the information about the asymptotics 
of the flow near the singular point, in order to suggest adequate local mesh 
refinement. 

In [4] we showed that the behaviour near the singular point of the
axisymmetric flow and of the plane flow is the same. So in what follows, for
simplicity we deal with plane flow, in the domain $\Omega$ which has the same shape
as in Figure 1. We investigate the Navier-Stokes equations in primitive
variables $u$ (velocity vector) and $p$ (pressure)

\[ (\nabla u) \cdot u - \nu \Delta u + \nabla p = 0 \quad \text{in } \Omega. \tag{7} \]

For the finite element approximation we take $\Omega$ to be a polygon in $\mathbb{R}^2$, for
simplicity. Let $\{\mathcal{T}_h\}_{h \to 0}$ be a regular (cf. [10]) family of triangulations of $\Omega$.

Let $P^m$ be the set of all polynomials of degree at most $m \ge 0$. Let $X^h$, $M^h$ be
the finite element spaces of Hood-Taylor elements (cf. e.g. F. Brezzi, M. Fortin
[3]), i.e.



X^ = {ne n C{Q), m/k e P\Kf,K G T4, 

M'* = {p G LI{Q),p!k e P\K),K G Th], 



( 8 ) 
cf. [7]. Velocity values are given in corner nodes and midside nodes of the
quadrilateral or the triangle, and pressure values only in corner nodes, in order
to satisfy the Babuska-Brezzi condition [3]. Then the velocity components and the
pressure are approximated as continuous functions of spatial variables.



4 Refinement of FEM Mesh Adjusted to Singularity 

Near the corner P where the angle is $\frac{3}{2}\pi$, velocities have the leading term in
the expansion as given in (6). There $\rho$ is the distance from the corner, $\vartheta$ the
angle. Note that $\partial u_i/\partial\rho \to \infty$ for $\rho \to 0$.

Now, in this section we assume the Stokes flow, for simplicity. A priori 
estimate of the finite element error is (cf. [10], [8]) 



||V(u-Uh)|lo + |b-p/i||o < 

- I ^ \h^+^{T)^ + I P ’ 

T T 



(9) 



where $k = 2$. Taking into account the expansion (6), and following Johnson's
idea [9], we derived in [4] the estimate

Tt 

I ^ ^ J pdp^C (10) 

VT — hT 

where $h_T$ is the diameter of the triangle $T$ of a triangulation $\mathcal{T}_h$, and $r_T$ is the
distance of the element $T$ from the corner. So, in order to get an error estimate
of the order we should guarantee 



This led us in [4] to an algorithm for generating the mesh near the corner:

Algorithm. Let $r_1$ be the distance of the large element from the corner. For
given auxiliary stepsize $h$ we compute recursively:
for $i = 1, 2, \ldots, N$:

\[ h_i = h \cdot (r_i)^{1 - \gamma/k}, \qquad r_{i+1} = r_i - h_i. \]




5 Model Problem 

Consider two-dimensional flow of a viscous, incompressible fluid described by
the Navier-Stokes equations in a domain with corner singularity, cf. Fig. 2.

Fig. 2. Geometry of the channel



Due to symmetry, we solve the problem only on the upper half of the
channel, cf. Figs 1, 3. On the inflow we consider a parabolic velocity profile,
at the outflow the 'do nothing' boundary condition. On the upper wall the no-slip
condition is imposed and on the lower wall a condition of symmetry is assumed
(i.e. only the $y$-component of velocity equals zero). We consider the following
parameters: $\nu = 0.0001$ m²/s, $u_{in}$ (max.) = 1 m/s.

The algorithm for mesh refinement described in the previous section is applied to
the corner where the channel or tube suddenly decreases in diameter (forward step,
corresponding to point A in Fig. 1).

We start with $r_1 = 0.25$ mm, $h = 0.1732$ mm, $k = 2$, $\gamma = 0.5444837$. This
corresponds to a contribution of approximately 3% of individual elements to the
global error. This way we get ten diameters of elements, cf. Tab. 1.



Tab. 1. Resulting refinement

$r_i$ (mm)   $h_i$ (mm)
0.25000      0.06316
0.18685      0.05110
0.13575      0.04050
0.09526      0.03129
0.06396      0.02342
0.04054      0.01681
0.02374      0.01138
0.01235      0.00708
0.00528      0.00381
0.00147      0.00147
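
The refinement in Tab. 1 can be regenerated with a few lines of Python. The sketch below assumes that the recursion of Section 4 reads $h_i = h \cdot r_i^{1-\gamma/k}$, $r_{i+1} = r_i - h_i$ (with $r_i$ measured in mm). With the starting values quoted in the text, this reading appears to reproduce the tabulated values to about four decimal places. The stopping rule, which lets the last element reach the corner, and the function name `graded_mesh` are assumptions made for illustration.

```python
def graded_mesh(r1, h, gamma, k):
    """Return (radii, sizes) for elements graded towards a corner.

    r1    : distance of the first (largest) element from the corner
    h     : auxiliary step size
    gamma : exponent of the leading singular term
    k     : polynomial degree used for the velocity
    """
    exponent = 1.0 - gamma / k
    radii, sizes = [], []
    r = r1
    while r > 0.0:
        step = h * r ** exponent
        if step >= r:
            # Assumption: the last element is truncated so that it ends at the corner.
            step = r
        radii.append(r)
        sizes.append(step)
        r -= step
    return radii, sizes


if __name__ == "__main__":
    radii, sizes = graded_mesh(r1=0.25, h=0.1732, gamma=0.5444837, k=2)
    for r, s in zip(radii, sizes):
        print(f"{r:.5f}  {s:.5f}")
```

Because the exponent $1 - \gamma/k$ is smaller than one, the computed step eventually exceeds the remaining distance, so the loop terminates after finitely many elements.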



6 Design of the mesh detail near the corner 

Using the parameters from Table 1, J. Šístek [13] suggested three variants of
mesh refinement near the corner (Figures 3-5). Mesh No. 1 was a classical 
mesh used before. He suggested two other variants in order to fit better to 
the algorithm of mesh refinement, especially to its polar coordinate nature 
(Figures 4-5). 



Fig. 3. Mesh No. 1 
Fig. 4. Mesh No. 2



Fig. 5. Mesh No. 3 



In Figs. 6-7 we present the whole computational mesh and its detail, using 
Mesh 3. 




Fig. 6. The whole computational mesh 




7 Evaluation of the Approximation Error 

To evaluate the error we use an a posteriori error estimate derived for the 
Stokes problem e.g. in [6]. Here $(\mathbf{u}, p) = (u, v, p)$ is the vector of the exact
solution, $(\mathbf{u}_h, p_h) = (u_h, v_h, p_h)$ is the approximate solution computed by the
FEM. The error is $e = (e_u, e_v, e_p) = (u - u_h, v - v_h, p - p_h)$. The a posteriori
estimate is:

\[ U^2(u - u_h, v - v_h, p - p_h, \Omega) \le \mathcal{E}^2(u_h, v_h, p_h, \Omega), \tag{12} \]

where 

U^{u-u,v-v,p-p,f2) = J2i \\ieu,e^)\\l^^ + ||ep||g^^,, 

S^{u,v,p, Q)=CY,i v,p)) dQ+ rl(u,v,p)df2'j . 

where $r_1$, $r_2$ and $r_3$ denote the residuals with respect to the 1st and the 2nd
N-S equation and the continuity equation (cf. [7], concerning choice of the
constant $C$). To evaluate the error on elements we use the modified absolute
error, defined as



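\[ A_m(u_h, v_h, p_h, \Omega_i) = \frac{|\Omega| \, \mathcal{E}^2(u_h, v_h, p_h, \Omega_i)}{|\Omega_i| \, U^2(u_h, v_h, p_h, \Omega)}, \tag{13} \]

where $|\Omega|$ is the area of the whole domain and $|\Omega_i|$ is the mean area of elements
obtained as $|\Omega_i| = |\Omega|/n$. Here $n$ denotes the number of all elements in the
domain.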



8 Numerical results 

In Figures 8 - 11 we present the graphical output of quantities that characterize
the flow in the channel. In Figs. 8, 9 we observe how strong the singularity is
both for the velocity and for the pressure (note that here the flow is from the
right to the left, to give a better view). Figure 8 shows that the location of the
peak of the singularity of velocity is outside the patch where the refinement
was done. One can see that, again, it is the behaviour of the pressure that is
decisive, cf. Fig. 9.




Fig. 8. Velocity component $v$ (Mesh 3)

Fig. 9. Pressure $p$ (Mesh 3)

Fig. 11. Streamlines near the forward and backward step.



On Figures 12-13 we present errors on elements near the corner (forward 
step) for Mesh 1 and Mesh 2. 



9 Conclusions 

Pros: 

— distribution of the error on elements is quite uniform (esp. for Mesh 2); 

— strength of singularity (both for the velocity and the pressure) is well cap- 
tured; 

— the algorithm of adjusted mesh refinement has been confirmed; 

— the efficiency of the algorithm: to achieve the desired precision one needs to 
carry out only one computation (compared with an adaptive approach, the 
same precision would require approximately 10 successive refinements). 

Cons: 

— suitable only for singularities due to "geometry";

— adaptive approach is much more robust. 

Nevertheless, efficient refinement of the mesh near the corners still remains a
challenge.
Summary. We give a brief overview of our recent work on the edge stabilization
method for flow problems. The application examples are convection-diffusion, with 
small diffusion parameter, and a generalized Stokes model. 



1 Introduction 

In order for the standard Galerkin finite element method to be stable for 
problems in CFD, some care must be taken. For convection-dominated flow 
problems, stabilization must be introduced, while for mixed methods the ap- 
proximations of velocities and pressure must either be carefully balanced or, 
again, stabilized. Examples of stabilization methods are the SUPG/SD-method 
[8], the discontinuous Galerkin method [9], the residual free bubbles [2], sub- 
viscosity models [7], and pressure projection methods for the Stokes problem 
[5]. The relation between the different approaches is also well understood in
most cases. However, for complex flow problems, most of these methods have
drawbacks. The SUPG stabilization becomes non-symmetric and the formulation
does not permit lumped mass; the residual free bubbles and discontinuous
Galerkin method add additional degrees of freedom; the projection methods
introduce the need of hierarchical meshes for the projection or the subviscosity
model. In this paper we give a brief review of our recent work, [3, 4], on
an alternative method originally proposed by Douglas and Dupont [6]. This
method stabilizes convection-diffusion-reaction problems, as well as equal order
interpolation methods for the generalized Stokes problem, by adding a
least-squares term based on the jump in the gradient of the discrete solu- 
tion over element boundaries. With this simple concept we obtain stability for 
convection-reaction-diffusion problems also in the vanishing viscosity limit as 
well as for the generalized Stokes problem with equal order interpolation. 

The advantage of this method, in comparison with the others mentioned, 
is that no additional degrees of freedom are added, no hierarchical meshes are 
needed, the formulation remains symmetric, and the mass can be lumped for 
efficient time marching and treatment of stiff source terms. The drawback is
an increased number of non-zero elements in the stiffness matrix due to the
fact that the gradient jump term couples neighboring elements. The
implementation also requires an element neighbor data structure that is not necessarily
available in standard finite element codes.



2 Model problems 

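As a first model problem, we consider, in $\Omega \subset \mathbb{R}^d$, $d = 2, 3$, the problem of
solving

\[ \sigma u + \beta \cdot \nabla u - \nabla \cdot (\varepsilon \nabla u) = f \quad \text{in } \Omega \tag{1} \]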

with, for simplicity, $u = 0$ on $\partial\Omega$. Here, $f$ is a given source term, $\beta$ is a given
smooth velocity field, satisfying V • /3 0, and a and e are bounded positive 

functions. 

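The weak form of this problem is to find $u \in H_0^1(\Omega)$ such that

\[ A(u,v) = (f,v) \quad \forall v \in H_0^1(\Omega), \tag{2} \]

where

\[ A(u,v) := \int_\Omega (\sigma u v + \varepsilon \nabla u \cdot \nabla v + \beta \cdot \nabla u \, v) \, dx \quad \text{and} \quad (f,v) := \int_\Omega f v \, dx. \]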

The second model problem that we consider is a generalized Stokes prob- 
lem, given by the partial differential equation 

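\[ \begin{aligned} \sigma u - \nu \Delta u + \nabla p &= f && \text{in } \Omega, \\ \nabla \cdot u &= g && \text{in } \Omega, \\ u \cdot n &= 0 && \text{on } \partial\Omega, \\ \nu u \cdot t &= 0 && \text{on } \partial\Omega, \end{aligned} \tag{3} \]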

where $\Omega$ is a bounded polygonal domain in $\mathbb{R}^d$ with boundary $\partial\Omega$, $d = 2, 3$, and
$\sigma$ and $\nu$ are two non-negative parameters that are not allowed to vanish
simultaneously, and $f$ is a force term. By $n$ we denote the outward pointing normal
to $\Omega$ and $t$ is a vector orthogonal to $n$. This problem can be written in weak
form as follows: Find

\[ u \in V = \{ v \in [H^1(\Omega)]^d : v|_{\partial\Omega} = 0 \} \]

when $\nu > 0$, alternatively

\[ u \in V = \{ v \in [L^2(\Omega)]^d,\ \nabla \cdot v \in L^2(\Omega) : v \cdot n|_{\partial\Omega} = 0 \} \]

for 1 / = 0, and p € Q = L 2 (J?)/R when > 0, alternatively p € for 

z/ = 0, such that 



\[ a(u,v) + b(p,v) - b(q,u) = L(v,q), \quad \forall (v,q) \in V \times Q, \tag{4} \]
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where 



a{u,v) — / (JUiVi 4- vVui ' Vvj da:, b{p,v) = — pV-vdx, 

J f2 J iT? 

and 

L{v,q)= / fvdx- gqdx. 

Jn JQ 

In the following, we shall denote the L2-scalar product by (*,•) and the 
corresponding norm by || • ||. 



3 The finite element methods 

For our first model problem, the finite element method consists of seeking 
piecewise polynomial approximation G C and for our second 

model problem Uh and where Uh G C V and Ph ^ C with 
and built from continuous functions. 

Consider a partitioning of $\Omega$ into a conforming triangulation $\mathcal{T}_h$ of affine
simplices $K$. We shall be concerned with the approximation

\[ V^h = \{ v \in H_0^1(\Omega) \cap C^0(\bar\Omega) : v|_K \in P^1(K)\ \ \forall K \in \mathcal{T}_h \}, \]

and a continuous pressure space,

\[ Q^h = \{ q \in Q \cap C^0(\bar\Omega) : q|_K \in P^1(K)\ \ \forall K \in \mathcal{T}_h \}. \]

It is well known that Galerkin methods, based on $V^h$, for convection dominated
problems produce severe oscillations in the presence of rapidly varying
features of the solution, and that the combination $[V^h]^d \times Q^h$ is unstable for the
generalized Stokes problem. The stabilization method is similar in both cases.
Our finite element methods are defined as follows.

Model problem I: find $u_h \in V^h$ such that

\[ A(u_h, v) + J(u_h, v) = (f, v) \quad \forall v \in V^h, \tag{5} \]

where 

= lhl^[^u]-[Vv]ds 

Aw 

= ^ O / • Vm] [uk ■ Vv] ds. 

^ ^ JdK 

Here, $h_{\partial K}$ is the size of $\partial K$, $[q]$ denotes the jump of $q$ across $\partial K$ for $\partial K \cap \partial\Omega = \emptyset$,
$[q] = 0$ on $\partial K \cap \partial\Omega \ne \emptyset$, $n_K$ is the outward pointing unit normal to $K$, and
$\gamma$ is a constant. We also introduce the local mesh size $h_K$ as the largest of the
$h_{\partial K}$ associated with element $K$, and we will assume that $h_K/h_{\partial K} \le C$ where
$C$ is a fixed constant.
Model problem II: Find $(u_h, p_h) \in [V^h]^d \times Q^h$ such that

a{uh, v) + b{ph, v) + j{uh, v) = L{v, 0) in i7, 

Hq,u) -j(ph,q) = L{o,q) in r?, 



( 7 ) 



for all (v,q) G x Q^, where 

j(p , «') := X] ^ f ^ l'>^K ■ '^Ph] [nK ■ Vg] ds 
^ ^ JdK 



( 8 ) 



and 

]{u,v):=J2^ f 7/1^^ [V • «] [V • v] ds, (9) 

For these methods we have, under additional regularity conditions,
consistency in that the exact solution fulfills the discrete equations (because the
jumps in the continuous variables vanish). What we also need to show is
stability. The stability estimates obtained using edge stabilization are less
immediate than those obtained in the case of streamline diffusion or discontinuous
Galerkin. For our first model problem, it is well known (cf. [8]) that the crucial
term to control is $\|h_K^{1/2} \beta \cdot \nabla u_h\|$. For a Galerkin method, no control whatsoever
of this term exists. In a discontinuous Galerkin method one exploits the fact
that $h_K \beta \cdot \nabla u_h$ is in the finite element test space and hence can be chosen
as test function, while in a streamline diffusion method it is simply included
in the test space. In the case of edge stabilization we do not have $h_K \beta \cdot \nabla u_h$
in the finite element space. However, something which is close is there (e.g.,
its interpolant), and the point is that the difference is controlled by the edge
stabilization. Thus, thanks to the term $J(u_h, v)$, we get the necessary control
of $\|h_K^{1/2} \beta \cdot \nabla u_h\|$.

For the second model problem the jump terms similarly give us control of 
necessary in the Stokes case and also • Uh\\ needed in the 

Darcy case. We then need to separate the Darcy case, where we use $s = 1$,
and the Stokes problem, where we use $s = 2$. This is similar to streamline
diffusion stabilization in the Navier-Stokes case, where the streamline diffusion
stabilization parameter must change from $O(h_K^2)$ to $O(h_K)$ when going from
viscosity-dominated to convection-dominated situations.

The details are given in [3, 4]; here we just give an indication of where the
stability comes from, exemplified using model problem I. More precisely, we
shall show that there exists some $C \ge C_0 > 0$ such that



■ Vuh - /3 • Vw)|l^ < (J{uh,Uh) 

for $\beta$ constant. Extensions to smooth, non-constant $\beta$ are straightforward. To this end, let $\mathcal{N}_i$
be the set of all triangles containing node $i$ and assume that the cardinality
of $\mathcal{N}_i$ is bounded uniformly in $i$. Let $\mathcal{I}_K$ be the set of all test functions $\varphi_i$ such
that $K \subset \operatorname{supp} \varphi_i$ and $\Omega_i = \bigcup_{K \in \mathcal{N}_i} K$. We will consider a function $w$ with $w|_K \in P_0(K)$,
and its representation in the finite element basis $\tilde w$ defined by

\[ \tilde w|_K = w|_K \sum_{i \in \mathcal{I}_K} \varphi_i. \tag{10} \]

It follows that w = w everywhere except on elements adjacent to Dirichlet 
boundaries where the boundary nodes are not included in the finite element 
space. We have: 

Lemma 1. Suppose that K is an element with at least one node on a Dirichlet 
boundary then 

rii 

where ni denotes the number of interior nodes of the element. 



Proof The proof is immediate noting that 



{w, w) 




^\wKfm{K). 



We will now proceed to prove that 

\\h^^‘^(w — 7Th'w)\\‘^ < CJs{w^w) = f h^~^^[w]‘^ds. 

^ JdK 



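The operator $\pi_h : P_0(K) \to V^h$, which denotes the lowest order Clément
operator, is constructed as follows:

\[ \pi_h w = \sum_i w_i \varphi_i, \]

where

\[ w_i = \frac{1}{m(\Omega_i)} \sum_{K \in \mathcal{N}_i} w|_K \, m(K). \tag{12} \]
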



In the following we will also write — 'w\k = with [u;] denoting 

the jump across element boundaries and the sum is taken over the shortest 
“path” from element to element K. 
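
As an illustration of the averaging behind the lowest order Clément operator, the following Python sketch computes nodal values of a piecewise-constant function on a one-dimensional mesh as measure-weighted means over the cells sharing each node. This is a simplified 1D analogue written under stated assumptions: every node is kept, including boundary nodes, whereas the construction in the text drops nodes on Dirichlet boundaries. The function name is illustrative.

```python
def clement_nodal_values(x, w):
    """Measure-weighted nodal averages of a piecewise-constant function.

    x : node coordinates, length n + 1, strictly increasing
    w : cell values, length n, w[j] living on the cell [x[j], x[j + 1]]
    """
    n = len(w)
    values = []
    for i in range(n + 1):
        # Cells sharing node i (one at the ends of the mesh, two in the interior).
        cells = [j for j in (i - 1, i) if 0 <= j < n]
        patch_measure = sum(x[j + 1] - x[j] for j in cells)
        weighted_sum = sum(w[j] * (x[j + 1] - x[j]) for j in cells)
        values.append(weighted_sum / patch_measure)
    return values


if __name__ == "__main__":
    print(clement_nodal_values([0.0, 1.0, 3.0], [2.0, 5.0]))
```

At an interior node the value is pulled towards the larger neighbouring cell, and the difference between neighbouring cell values is exactly the jump $[w]$ that the stabilization term controls.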

It is now straightforward to show that the projection error is controlled by 
the operator Js{w^w) 



\\h%‘^{'KhW-w)f=yy^ f h%(^y2 i yZ m{K^))<fi-w^ da: 



= EX^^(E E HK^-MK)m{K^)p’i) 

K i^TK ^ ' K^eMi 



dx 



T. {Em}W) 

K ^ ’ K'eJPi Ki 

<0^1 . 

^ JdK 



‘dx 



h^K^[w]‘^ds < CJs{w,w), 
where we used the upper bound on the number of triangles neighboring
a node and a scaling argument. We have proved the following:

Lemma 2 . If w is some piecewise constant function, w is defined by (10) and 
TTh is the Clement interpolant on , then the edge stabilization term satisfies 

\\h]^‘^{TThW -w)\\'^ <'fJs{w,w) (13) 

for some 7 > 70 > 0 independent of hx • 

Note that by the construction of $\tilde w$ we get less stabilization in elements
adjacent to Dirichlet boundaries than in the interior of the domain, hence
we expect to get poorer stabilizing properties close to sharp outflow layers
(when diffusion is present), something which is confirmed by the numerical
experiments.



4 Numerical examples 

4.1 Convection-diffusion

We consider the domain (0, 1) x (0, 1), with a = 0,s — 10“^, and /3 = (1 — ?/, x). 
The boundary condition on the inflow boundary is $u = 0$ at $y = 0$ and $u = 0$
at $x = 0$ except for $0.3 < y < 0.5$, where $u = 1$. We compare the numerical
solutions using the present method and the streamline diffusion method in
Figure 1. The results are of comparable quality.





Fig. 1. Streamline diffusion (left) and gradient jump stabilization (right) for 
convection-diffusion 



4.2 Stokes problem 

To illustrate the absence of boundary layers in the pressure using the gradient
jump method, we consider the Poiseuille flow problem on $(0,4) \times (0,1)$ on an
unstructured mesh. The inflow boundary velocity is $u = (y(y - 1), 0)$, we
have set $\sigma = 0$ and $\nu = 1$. At $x = 4$, we apply a natural boundary condition.
We solve this problem using $P^1/P^1$-approximations with the streamline diffusion
method and the gradient jump method on the mesh shown in Figure 2.
Also shown is the computed velocity field (indistinguishable for the different
methods). Finally, in Figure 3, we show the pressure isolines for the streamline
diffusion method (top) and the gradient jump method (bottom).

Fig. 2. Mesh and computed velocity field
Summary. A solutal, anisotropic phase-field model for dendritic growth of an
isothermal binary alloy is considered. Existence of a weak solution is established
provided the physical anisotropy is small enough. A semi-implicit finite element method
is proposed to solve the problem. A priori and a posteriori error estimates are
derived when the physical anisotropy is small. An adaptive algorithm which aims at
producing meshes with large aspect ratio is proposed. Numerical results show that
accurate solutions can be obtained, even when the physical anisotropy is large.



1 Introduction 

A phase-field model for the dendritic growth in a domain $\Omega$ of an isothermal
binary alloy is considered. The mathematical model contains the physical
descriptions of [22, 20] and consists in a parabolic system of nonlinear equations
set in $\Omega$. The unknowns are the phase-field $\phi$, which is an order parameter
taking the value 1 in the solid phase and 0 in the liquid phase (see Fig. 1)
and the concentration $c$ of the binary alloy. In the phase-field approach, $\phi$ is
regularized, both $\phi$ and $c$ vary rapidly but smoothly across the thin
solid-liquid interface. A typical profile of $\phi$ and $c$ across the horizontal abscissa $x_1$
is presented in Fig. 2.




Fig. 1. The phase-field $\phi$; the annotation marks the thickness of the solid-liquid interface.

Fig. 2. Typical profiles of the phase field $\phi$ (left) and concentration field $c$ (right). The
phase-field has values zero or one, except in the phase change region. The
concentration field changes rapidly across the phase change region, but may also vary outside
the phase change region.



A general mathematical formulation of this solidification problem is the
following. Given a bounded domain $\Omega$ of $\mathbb{R}^2$ with boundary $\partial\Omega$ and outer unit
normal $n$, two functions $c_0, \phi_0 : \Omega \to \mathbb{R}$ and a time interval $(0, T)$ we consider
the problem of finding $\phi, c : \Omega \times (0, T) \to \mathbb{R}$ such that



- div (a(V0)V<A)) - S{c, (A) = 0 


in i? X (0,T), 


(1) 


^ - div (i)i(<A)Vc + D 2 (c, <A)V^) = 0 


in i? X (0, T), 


(2) 


A(W<p)'S74> • n = 0 


on df2 X (0, T), 


(3) 


(Di(^)Vc + Z? 2 (c, <A)V(A) • n = 0 


on df2 X (0, T), 


(4) 


<A(0) = <po, c(0) = Co 


in Q. 


(5) 



Here $\alpha \in \mathbb{R}$ is a positive parameter and $A(\cdot)$ is the matrix defined for
$\xi \in \mathbb{R}^2 \setminus \{0\}$ by

\[ A(\xi) = \begin{pmatrix} a^2(\theta(\xi)) & -a(\theta(\xi)) \, a'(\theta(\xi)) \\ a(\theta(\xi)) \, a'(\theta(\xi)) & a^2(\theta(\xi)) \end{pmatrix}, \tag{6} \]

where $\theta(\xi)$ denotes the angle between the vector $\xi$ and a preferential direction,
$a$ is the real-valued function defined by

\[ a(\theta) = 1 + \bar a \cos(k\theta), \tag{7} \]

with $\bar a \ge 0$ (the anisotropy parameter) and $k$ a positive integer corresponding
to the number of dendritic branching directions. Without loss of generality, the
horizontal abscissa $x_1$ can be chosen as preferential direction and we have

\[ \cos\theta(\xi) = \frac{\xi \cdot e_1}{\|\xi\|}, \]


where ei is the unit vector in the horizontal direction. From the physical point 
of view, (1) is nothing but a generalization of a mean curvature - or Allen-Cahn 
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- equation, see for instance [9, 21] for a general presentation. This equation 
is obtained by taking the derivative of a free energy functional accounting 
for phase transformation, double well barriers and an interfacial anisotropic 
energy contribution 



$E(\nabla\phi) = \frac{1}{2} \int_\Omega a^2(\theta(\nabla\phi))\,|\nabla\phi|^2\,dx.$

It can be noticed that the derivative $DE$ of $E$ with respect to $\nabla\phi$ is given by

$\langle DE(\nabla\phi), \psi \rangle = \int_\Omega A(\nabla\phi)\nabla\phi \cdot \nabla\psi\,dx,$

which is the weak form corresponding to the second term of (1). Due to the
double well barriers, the function $S(\cdot, \cdot)$ in (1) contains small parameters that
force the phase-field $\phi$ to take values 0 or 1, except in a small region, see again
Fig. 1. Finally, equation (2) corresponds to solute conservation. We refer to
[22, 20] for details about the physical derivation of the model.
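
The relation between the energy density and the matrix $A(\cdot)$ can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the preferential direction is $e_1$, uses the illustrative values $\bar a = 0.04$ and $k = 4$, and compares $A(\xi)\xi$ with a central finite-difference gradient of $\frac12 a^2(\theta(\xi))|\xi|^2$. The function names are hypothetical.

```python
import numpy as np

def a(theta, abar=0.04, k=4):
    # Anisotropy function (7): a(theta) = 1 + abar * cos(k * theta)
    return 1.0 + abar * np.cos(k * theta)

def a_prime(theta, abar=0.04, k=4):
    # Derivative of a with respect to theta
    return -abar * k * np.sin(k * theta)

def A(xi, abar=0.04, k=4):
    # Matrix (6); theta is the signed angle between xi and e_1
    theta = np.arctan2(xi[1], xi[0])
    av, ap = a(theta, abar, k), a_prime(theta, abar, k)
    return np.array([[av * av, -av * ap],
                     [av * ap,  av * av]])

def energy_density(xi, abar=0.04, k=4):
    # Integrand of the anisotropic energy: 0.5 * a(theta(xi))^2 * |xi|^2
    theta = np.arctan2(xi[1], xi[0])
    return 0.5 * a(theta, abar, k) ** 2 * (xi[0] ** 2 + xi[1] ** 2)

def fd_gradient(f, xi, eps=1e-6):
    # Central finite-difference gradient of a scalar function of xi
    g = np.zeros(2)
    for i in range(2):
        d = np.zeros(2)
        d[i] = eps
        g[i] = (f(xi + d) - f(xi - d)) / (2.0 * eps)
    return g

xi = np.array([0.3, -0.7])              # arbitrary test vector, away from xi = 0
flux = A(xi) @ xi                       # A(xi) xi
grad = fd_gradient(energy_density, xi)  # gradient of the energy density
```

Both vectors agree up to the finite-difference error, which is the pointwise counterpart of the weak identity stated above.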



2 Existence 

Mathematical and numerical studies corresponding to (1)-(5) have already
been presented in the isotropic case, that is when $\bar a = 0$ and thus $A(\cdot) = I$. More
precisely, existence has been proved in [17] using a Faedo-Galerkin method,
a finite element procedure together with a priori error estimates has been
proposed in [13] and adaptive finite elements have been presented in [14].
Existence of a weak solution in the anisotropic case $\bar a \ne 0$ has been proved in
[8]. More precisely, if $\phi_0, c_0 \in L^2(\Omega)$, if $S$, $D_1$ and $D_2$ are continuous, bounded
Lipschitz functions satisfying, for all $\phi$ in $\mathbb{R}$:

$0 < D_s \le D_1(\phi) \le D_l$,   (8)

and if $\bar a < \frac{1}{k^2 - 1}$, then problem (1)-(5) has a weak solution

$\phi, c \in L^2(0, T; H^1(\Omega)) \cap H^1(0, T; (H^1(\Omega))')$

such that

$\alpha \langle \frac{\partial\phi}{\partial t}, v \rangle + \int_\Omega A(\nabla\phi)\nabla\phi \cdot \nabla v - \int_\Omega S(c, \phi)\,v = 0,$

$\langle \frac{\partial c}{\partial t}, w \rangle + \int_\Omega D_1(\phi)\nabla c \cdot \nabla w + \int_\Omega D_2(c, \phi)\nabla\phi \cdot \nabla w = 0,$

for all $v, w \in H^1(\Omega)$, a.e. $0 < t < T$. Here, $\langle \cdot, \cdot \rangle$ denotes the duality
pairing between $H^1(\Omega)$ and its dual. Moreover, if the functions $S$ and $D_2$
satisfy, for all $\phi$, $c$ in $\mathbb{R}$,

$S(c, 0) = S(c, 1) = 0,$

they can be extended by zero outside the interval (0,1), and a maximum
principle holds for $\phi$ and $c$, that is, if $0 \le \phi_0, c_0 \le 1$, then $0 \le \phi(t), c(t) \le 1$
for all $t$.

In order to prove the existence result, the implicit Euler scheme is considered.
Given an integer $N$, we set $\tau = T/N$ the time step, $t^n = n\tau$, $n = 0, ..., N$,
$\phi^0 = \phi_0$ and $c^0 = c_0$. For $n = 1, 2, ..., N$, $\phi^{n-1}$ and $c^{n-1}$ being known, let
$\phi^n$ and $c^n$ be approximations of $\phi(t^n)$ and $c(t^n)$ such that

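$\alpha \frac{\phi^n - \phi^{n-1}}{\tau} - \mathrm{div}\,(A(\nabla\phi^n)\nabla\phi^n) - S(c^{n-1}, \phi^n) = 0$   in $\Omega$,   (9)

$\frac{c^n - c^{n-1}}{\tau} - \mathrm{div}\,(D_1(\phi^n)\nabla c^n + D_2(c^{n-1}, \phi^n)\nabla\phi^n) = 0$   in $\Omega$.   (10)

Existence of $\phi^n$ is obtained by rewriting the first equation as a minimization
problem. The functional to be minimized is defined by the potential

$J(\psi) = \int_\Omega \left( \frac{\alpha}{2\tau}(\psi - \phi^{n-1})^2 + \frac{1}{2} a^2(\theta(\nabla\psi))\,|\nabla\psi|^2 + f(c^{n-1}, \psi) \right) dx,$

where the function $f$ is such that $\frac{\partial f}{\partial\phi}(c, \phi) = -S(c, \phi)$. The existence of a unique
minimizer is a consequence of the fact that the Ginzburg-Landau potential $E(\cdot)$
defined by

$E(\phi) = \frac{1}{2} \int_\Omega a^2(\theta(\nabla\phi))\,|\nabla\phi|^2\,dx$

is strictly convex whenever $\bar a < \frac{1}{k^2 - 1}$.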
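
The threshold $1/(k^2 - 1)$ can be illustrated with a short computation. A standard calculation in polar coordinates (stated here as background, not quoted from the paper) shows that the Hessian of $\xi \mapsto \frac12 a^2(\theta(\xi))|\xi|^2$ has determinant $a^3 (a + a'')$, so that, for $a > 0$, strict convexity amounts to $a + a'' > 0$. With $a(\theta) = 1 + \bar a\cos(k\theta)$ this reads $1 - \bar a (k^2 - 1)\cos(k\theta) > 0$. The sketch below, with hypothetical function names, evaluates the minimum of $a + a''$ on a grid of angles.

```python
import numpy as np

def convexity_indicator(abar, k, n=3601):
    # Minimum over theta of a(theta) + a''(theta) for a = 1 + abar*cos(k*theta).
    # A positive value indicates strict convexity of the energy density.
    theta = np.linspace(0.0, 2.0 * np.pi, n)
    a = 1.0 + abar * np.cos(k * theta)
    a_second = -abar * k ** 2 * np.cos(k * theta)
    return float(np.min(a + a_second))

def critical_anisotropy(k):
    # Threshold appearing in the existence result: 1 / (k^2 - 1)
    return 1.0 / (k ** 2 - 1)

small = convexity_indicator(0.04, 4)   # below the threshold for k = 4
large = convexity_indicator(0.10, 4)   # above the threshold for k = 4
```

For $k = 4$ the threshold is $1/15 \approx 0.0667$; the indicator is positive for $\bar a = 0.04$ and negative for $\bar a = 0.1$, the two values used in the numerical experiments of Section 5.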

3 Finite elements and error estimates 

3.1 A priori error estimates 

For any $0 < h < 1$, let $T_h$ be a conforming triangulation of $\Omega$ into triangles
$K$ with diameter less than $h$. Let $V_h$ be the usual finite element space of
continuous, piecewise linear functions on the triangles of $T_h$. The finite element
scheme corresponding to (9) (10) is considered. For each $n = 1, ..., N$, we are
first looking for $\phi_h^n$ in $V_h$ such that

$\alpha \int_\Omega \frac{\phi_h^n - \phi_h^{n-1}}{\tau}\,v_h + \int_\Omega A(\nabla\phi_h^n)\nabla\phi_h^n \cdot \nabla v_h - \int_\Omega S(c_h^{n-1}, \phi_h^n)\,v_h = 0,$   (11)

for all $v_h \in V_h$ and then for $c_h^n$ in $V_h$ such that

$\int_\Omega \frac{c_h^n - c_h^{n-1}}{\tau}\,w_h + \int_\Omega D_1(\phi_h^n)\nabla c_h^n \cdot \nabla w_h + \int_\Omega D_2(c_h^{n-1}, \phi_h^n)\nabla\phi_h^n \cdot \nabla w_h = 0,$   (12)

for all $w_h \in V_h$.

A priori error estimates in the $L^2(0, T; H^1(\Omega))$ norm have been proved in
[5] in the case when $\bar a < \frac{1}{k^2 - 1}$. The convergence proof strongly relies on the
strong monotonicity of the operator $A$. More precisely, the following result is
used to prove convergence.

Lemma 1. Let $A(\cdot)$ be the operator defined by (6) and let the convexity condition
$\bar a < \frac{1}{k^2 - 1}$ hold. Then there exists $\mu$ (depending on $\bar a$) such that, for all
$\phi, \psi \in H^1(\Omega)$, we have

$\mu\,\|\nabla(\phi - \psi)\|^2_{L^2(\Omega)} \le \left( A(\nabla\phi)\nabla\phi - A(\nabla\psi)\nabla\psi,\ \nabla(\phi - \psi) \right).$   (13)

Then, assuming that the solution $(\phi, c)$ is smooth, the following result is
proved in [5].

Theorem 1. Let $\phi$, $c$ be the weak solution of (1)-(5), let $\phi_h^n$, $c_h^n$ be defined by
(11)-(12). Assume that

(<A,c) € 

Then, there is a constant $C$ such that, for all $h, \tau > 0$, we have

rW^irn - 4>in)\\Un) + E^I|V(c;: - cin)\\Un) 

n=l n=l 

<c(r'^ + h‘^ + yy (14) 

3.2 A posteriori error estimates for meshes with high aspect ratio 

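As in [15, 16], our goal is to perform adaptive finite elements with high aspect
ratio for solving numerically (1)-(5) in the general case when $\bar a \ne 0$. The
general framework of [10, 11] is considered.

For any triangle $K$ of the mesh, let $T_K : \hat K \to K$ be the affine transformation
which maps the reference triangle $\hat K$ onto $K$. Let $M_K$ be the Jacobian of
$T_K$, that is

$x = T_K(\hat x) = M_K \hat x + t_K.$

Since $M_K$ is invertible, it admits a singular value decomposition $M_K =
R_K^T \Lambda_K P_K$, where $R_K$ and $P_K$ are orthogonal and where $\Lambda_K$ is diagonal with
positive entries. In the following we set

$\Lambda_K = \begin{pmatrix} \lambda_{1,K} & 0 \\ 0 & \lambda_{2,K} \end{pmatrix}, \quad R_K = \begin{pmatrix} r_{1,K}^T \\ r_{2,K}^T \end{pmatrix},$

with the choice $\lambda_{1,K} \ge \lambda_{2,K}$. A simple example of such a transformation is
$x_1 = H \hat x_1$, $x_2 = h \hat x_2$, with $H \ge h$, thus $\lambda_{1,K} = H$, $\lambda_{2,K} = h$, $r_{1,K} = (1, 0)^T$ and $r_{2,K} = (0, 1)^T$.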
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Fig. 3. A simple example of transformation from element K to K 



In the framework of meshes with high aspect ratio, the classical minimum
angle condition must be avoided. However, it is required that, for each vertex,
the number of neighbouring vertices is bounded above, uniformly with respect
to the mesh size $h$. Also, for any patch $\Delta_K$ (the set of triangles having a vertex
in common with $K$), the diameter of the corresponding reference patch, that
is $T_K^{-1}(\Delta_K)$, must be uniformly bounded independently of $h$. This latter
hypothesis excludes some distorted patches, see Fig. 4. Let $I_h : H^1(\Omega) \to V_h$
be a Clément or Scott-Zhang like interpolation operator. We now recall some
interpolation results due to [10, 11].
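
The quantities $\lambda_{i,K}$ and $r_{i,K}$ attached to a triangle can be computed directly from its vertices. The sketch below is illustrative only: it assumes the reference triangle has vertices (0,0), (1,0), (0,1), which the text does not specify, and it relies on the convention $M_K = R_K^T \Lambda_K P_K$, so that the directions $r_{i,K}$ are the left singular vectors of $M_K$. The function name is hypothetical.

```python
import numpy as np

def anisotropy_from_triangle(vertices):
    # Jacobian M_K of the affine map from the reference triangle
    # (0,0), (1,0), (0,1) (an assumption of this sketch) onto the triangle.
    p0, p1, p2 = np.asarray(vertices, dtype=float)
    M = np.column_stack((p1 - p0, p2 - p0))
    # numpy returns M = U @ diag(s) @ Vt with s sorted in decreasing order,
    # hence R_K^T = U and the stretching directions are the columns of U.
    U, s, Vt = np.linalg.svd(M)
    lam1, lam2 = s
    r1, r2 = U[:, 0], U[:, 1]
    return lam1, lam2, r1, r2

# Stretched triangle corresponding to x1 = H * xhat1, x2 = h * xhat2
H, h = 1.0, 0.01
lam1, lam2, r1, r2 = anisotropy_from_triangle([(0.0, 0.0), (H, 0.0), (0.0, h)])
aspect_ratio = lam1 / lam2
```

For this triangle the singular values are $H$ and $h$, and the directions are the coordinate axes up to sign, so the aspect ratio is $H/h$.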

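Lemma 2. There is a constant $C$ depending only on the reference element $\hat K$
such that, for all $v \in H^1(\Omega)$, for all $K \in T_h$, we have

$\|v - I_h v\|^2_{L^2(K)} + \lambda_{2,K}\,\|v - I_h v\|^2_{L^2(\partial K)} \le C\,\omega_K^2(v).$

Here we have set

$\omega_K^2(v) = \lambda_{1,K}^2 \left( r_{1,K}^T G_K(v)\, r_{1,K} \right) + \lambda_{2,K}^2 \left( r_{2,K}^T G_K(v)\, r_{2,K} \right),$

and where $G_K(v)$ denotes the 2x2 matrix defined by

$G_K(v) = \begin{pmatrix} \int_{\Delta_K} \left( \frac{\partial v}{\partial x_1} \right)^2 dx & \int_{\Delta_K} \frac{\partial v}{\partial x_1} \frac{\partial v}{\partial x_2}\,dx \\ \int_{\Delta_K} \frac{\partial v}{\partial x_1} \frac{\partial v}{\partial x_2}\,dx & \int_{\Delta_K} \left( \frac{\partial v}{\partial x_2} \right)^2 dx \end{pmatrix}.$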

For $0 \le t \le T$, we denote by $\phi_h(t)$ and $c_h(t) \in V_h$ the semi-discrete finite
element approximations corresponding to (11) and (12), respectively. In order
to prove a posteriori error estimates we need to assume that the error in the
$L^2(0, T; L^2(\Omega))$ norm converges faster than the error in the $L^2(0, T; H^1(\Omega))$
norm, that is, there are two constants $C > 0$ and $s \in\, ]0, 1]$ such that, for all
meshes $T_h$, we have

$\int_0^T \left( \|\phi - \phi_h\|^2_{L^2(\Omega)} + \|c - c_h\|^2_{L^2(\Omega)} \right) \le C \left( \max_{K \in T_h} \lambda_{2,K} \right)^{2s} \int_0^T \left( \|\nabla(\phi - \phi_h)\|^2_{L^2(\Omega)} + \|\nabla(c - c_h)\|^2_{L^2(\Omega)} \right).$   (15)

Fig. 4. Example of an acceptable patch (top): the size of the reference patch does
not depend on the aspect ratio H/h. Example of a nonacceptable patch (bottom):
the size of the reference patch now depends on the aspect ratio H/h.

Let us comment on this assumption in the frame of meshes with small aspect ratio,
that is to say when $\lambda_{1,K}$ and $\lambda_{2,K}$ are of order $h$, for all $K \in T_h$. According to
the results of section 3.1, the error in the $L^2(0, T; H^1(\Omega))$ norm is shown to be
$O(h)$ whenever the time step is small. On the other hand, $O(h^2)$ convergence
in the $L^\infty(0, T; L^2(\Omega))$ norm is proved in [13], but only in the case when
$\bar a = 0$. Therefore, we expect the following assumption

$\int_0^T \left( \|\phi - \phi_h\|^2_{L^2(\Omega)} + \|c - c_h\|^2_{L^2(\Omega)} \right) \le C h^2 \int_0^T \left( \|\nabla(\phi - \phi_h)\|^2_{L^2(\Omega)} + \|\nabla(c - c_h)\|^2_{L^2(\Omega)} \right)$

to hold for meshes with small aspect ratio, provided optimal convergence results
hold in both $L^2(0, T; H^1(\Omega))$ and $L^2(0, T; L^2(\Omega))$ norms. Note that this
assumption has already been used in [14] to obtain a posteriori error estimates
in the case when $\bar a = 0$. When considering meshes with high aspect ratio, $h$
in the above estimate should be replaced by $\max_{K \in T_h} \lambda_{2,K}$, which yields (15).
This assumption is checked numerically in section 4.



For each interior edge of $T_h$, let us choose an arbitrary normal direction $n$, and
let $[\psi]$ denote the jump of $\psi$ across the edge. For each edge of $T_h$ lying on the
boundary $\partial\Omega$, we set $[\psi]$ to twice the inner side value of $\psi$. The following result
is proved in [6].

Theorem 2. Let $\phi$, $c$ be the weak solution of (1)-(5), let $\phi_h$, $c_h$ be the
semi-discrete approximation corresponding to (11) (12). Assume that $\phi, c \in$
L°°(0, T; i7^(i?)) and that (15) holds. Then, there is a constant C depending 
only on the interpolation constants of Lemma 2 such that, for all meshes $T_h$ such
that $\max_{K \in T_h} \lambda_{2,K}$ is sufficiently small, we have



+ 



+ II V(^ - 

^\\{c - CH){T)\\h^a) + ll^(" - 

<fll(-^-<A/.)(0)|li3(^) + |^||(c-c,)(0)||i.(^) 

^ - div {A{V4>hW<Ph) - S{cH,4>h) 



■cf e( 



a- 



LHK) 



+ ~~T/2 \\[M'^4>h)'V(ph ■ n]||i,2(aK) ) X - 4>h) 



\l^(K) 



+ 









LHdK), 



X ljk{c — Ch). (16) 



Here $\mu$ is the constant of Lemma 1, $D_s$ is the constant of (8) and $M_2 =
\|D_2\|_{L^\infty(\mathbb{R}^2)}$. Moreover, $\omega_K(\cdot)$ is defined as in Lemma 2.

Estimate (16) is not a usual a posteriori error estimate since $\phi$ and $c$ are still
involved in the right hand side. We then proceed as in [15, 16] and introduce
an estimator based on superconvergent recovery, namely a Zienkiewicz-Zhu
(Z-Z) like estimator [23, 3, 24]. More precisely, we consider the simplest Z-Z
error estimator as defined in [19, 1]. The Z-Z error estimator corresponding
to $\nabla(c - c_h)$ is defined by the difference between $\nabla c_h$ and an approximate
projection of $\nabla c_h$ onto $V_h \times V_h$, namely:



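$\eta^{ZZ}(c_h) = \begin{pmatrix} \eta_1^{ZZ}(c_h) \\ \eta_2^{ZZ}(c_h) \end{pmatrix} = \begin{pmatrix} (I - \Pi_h)\left( \frac{\partial c_h}{\partial x_1} \right) \\ (I - \Pi_h)\left( \frac{\partial c_h}{\partial x_2} \right) \end{pmatrix}.$   (17)

Here $\Pi_h : g \mapsto \Pi_h g \in V_h$ is defined by

$\int_\Omega r_h\left( (\Pi_h g)\, v_h \right) = \int_\Omega g\, v_h \quad \forall v_h \in V_h,$

where $r_h$ denotes the usual Lagrange interpolant. In other words, from constant
values of $\nabla c_h$ on triangles, we build values at vertices $P$ using the formula

$\begin{pmatrix} \Pi_h\left( \frac{\partial c_h}{\partial x_1} \right)(P) \\ \Pi_h\left( \frac{\partial c_h}{\partial x_2} \right)(P) \end{pmatrix} = \frac{1}{\sum_{K \in T_h,\ P \in K} |K|} \begin{pmatrix} \sum_{K \in T_h,\ P \in K} |K| \left( \frac{\partial c_h}{\partial x_1} \right)\Big|_K \\ \sum_{K \in T_h,\ P \in K} |K| \left( \frac{\partial c_h}{\partial x_2} \right)\Big|_K \end{pmatrix}.$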

From [2, 19] we know that for a certain class of meshes (namely parallel meshes)
and for smooth solutions, Z-Z like error estimators are asymptotically exact
(i.e. the Z-Z error estimator converges to the true error when $h$ goes to zero).
Our error indicator corresponding to $c - c_h$ is then obtained by replacing the
matrix $G_K(c - c_h)$ present in the definition of $\omega_K(c - c_h)$ in (16) by the matrix
$\tilde G_K(c_h)$ defined by



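$\tilde G_K(c_h) = \begin{pmatrix} \int_K \left( \eta_1^{ZZ}(c_h) \right)^2 dx & \int_K \eta_1^{ZZ}(c_h)\, \eta_2^{ZZ}(c_h)\,dx \\ \int_K \eta_1^{ZZ}(c_h)\, \eta_2^{ZZ}(c_h)\,dx & \int_K \left( \eta_2^{ZZ}(c_h) \right)^2 dx \end{pmatrix}.$   (18)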
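
The recovery step behind the Z-Z estimator is an area-weighted average of the piecewise constant gradients around each vertex. The following minimal sketch illustrates it for continuous piecewise linear functions on a small triangulation; the mesh, the array layout and all function names are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def p1_gradients(points, triangles, u):
    # Constant gradient of the P1 interpolant on each triangle, and triangle areas.
    p0, p1, p2 = (points[triangles[:, j]] for j in range(3))
    e1, e2 = p1 - p0, p2 - p0
    det = e1[:, 0] * e2[:, 1] - e1[:, 1] * e2[:, 0]
    du1 = u[triangles[:, 1]] - u[triangles[:, 0]]
    du2 = u[triangles[:, 2]] - u[triangles[:, 0]]
    gx = (du1 * e2[:, 1] - du2 * e1[:, 1]) / det
    gy = (du2 * e1[:, 0] - du1 * e2[:, 0]) / det
    return np.column_stack((gx, gy)), 0.5 * np.abs(det)

def recover_gradient(points, triangles, u):
    # Area-weighted average of the triangle gradients at each vertex P.
    grads, areas = p1_gradients(points, triangles, u)
    num = np.zeros((len(points), 2))
    den = np.zeros(len(points))
    for j in range(3):
        np.add.at(num, triangles[:, j], areas[:, None] * grads)
        np.add.at(den, triangles[:, j], areas)
    return num / den[:, None]

def zz_estimator(points, triangles, u):
    # L2 norm of grad(u_h) minus the recovered gradient; the difference is
    # linear on each triangle, so its square is integrated exactly from the
    # three vertex values.
    grads, areas = p1_gradients(points, triangles, u)
    rec = recover_gradient(points, triangles, u)
    diff = grads[:, None, :] - rec[triangles]          # shape (nt, 3, 2)
    s = diff.sum(axis=1)
    local = areas / 12.0 * ((s ** 2).sum(axis=1) + (diff ** 2).sum(axis=(1, 2)))
    return float(np.sqrt(local.sum()))

# Unit square split into four triangles around its centre (illustrative mesh)
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.5]])
triangles = np.array([[0, 1, 4], [1, 2, 4], [2, 3, 4], [3, 0, 4]])
u_lin = 2.0 * points[:, 0] + 3.0 * points[:, 1]
u_quad = points[:, 0] ** 2
```

For a globally linear function the recovered gradient coincides with the exact one and the estimator vanishes, whereas it is strictly positive for the interpolant of $x_1^2$.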



Finally, our simplified error indicator corresponding to the concentration error 
c — Ch is defined on each triangle K by 



f 






dch 

dn 



L^(dK) 



{^IkGk (cfc)ri,Ar) + {ch)v2,K) ^ 



1/2 



(19)

4 Numerical study of the effectivity index for small time steps



In this section, the quality of our error indicator (19) is investigated numerically.
For details we refer to [6]. Let us consider $\phi_h^n$ and $c_h^n \in V_h$ the solutions
of (11) and (12), respectively. In practice, $\phi_h^n$ is obtained by performing only
one Newton iteration at each time step. Proceeding as in [16] we introduce $c_{h\tau}$,
the continuous, piecewise linear approximation in time defined by

$c_{h\tau}(x, t) = \frac{t - t^{n-1}}{\tau}\, c_h^n(x) + \frac{t^n - t}{\tau}\, c_h^{n-1}(x), \quad t^{n-1} \le t \le t^n,\ x \in \Omega,$   (20)

and the simplified error indicator $\eta_{n,K}(c_{h\tau})$ for each time interval $[t^{n-1}, t^n]$ and triangle
$K$ by



{vnMchr)) j 



1/2 

2,K 



dn 



\L‘^{dK) 






where $\tilde G_K(c_{h\tau})$ is defined as in (18).

We consider the model of [20], the notations being those of [12]. The source 
term S in (1) is defined by 



5(c, « = - m - - «) 

if 0 < (/) < 1, 



whereas 5(c, </>) — 0 if (/> < 0 or 0 > 1 . Here A is the thickness of the solid-liquid 
interface, mi is the liquid slope in the phase diagram, F is the Gibbs-Thomson 
isotropic coefficient, ci is the liquid concentration in the phase diagram, and 
k is the phase diagram partition coefficient (thus Cg = kci^ where Cg is the 
solid concentration in the phase diagram). The coefficient a in (1) equals 
where fik is the interface kinetic coefficient. The first term in the definition of 
*S'(-, •) is nothing but the derivative of a double well potential that forces (j) to 
values zero or one, except in the phase change region. An asymptotic expansion 
of the phase-field equation at first order in A links the normal velocity of the 
solid-liquid interface, some anisotropic measure of the interface curvature, and 
the concentration field. The function D\ in (2) is given by 

DiW = £>,+ . (Di-Ds) if0<^<l, 

1 — 0 + 



whereas $D_1(\phi) = D_l$ if $\phi \le 0$ and $D_1(\phi) = D_s$ if $1 \le \phi$. Here $D_s$ and $D_l$
are the solid and liquid diffusion coefficients. Finally, the function $D_2$ in (2) is
given by



D2{c,<l^) = D,i^) 



(1 — k)c{l — c) 
1 — 0 + 



if $0 < c < 1$,

whereas $D_2(c, \phi) = 0$ if $c \le 0$ or $c \ge 1$. All the physical parameters are given
below in the international MKSA unit system.

Our first goal is to validate numerically assumption (15) in the context 
of meshes with high aspect ratio. For this purpose, we set the computational 
domain to i? = [—0.0002,0.0002]^, we add source terms in (1) (2) so that 0 
and c are given by 

1 — tanh 

^{xi,X2,t) = c{xi,X2,t) = 

where v = 2 10“^ and 5 = 10“®. Dirichlet boundary conditions are prescribed 
on the vertical sides of f7, homogeneous Neumann boundary conditions on 
the horizontal sides. The physical parameters involved in the definition of 5, 
Di and D 2 are given in Tab. 1, the time step is r == 5 10“^ and is small 
enough in order to overkill the error due to time discretization. Meshes with 
high aspect ratio are used to validate assumption (15). In Tab. 2, errors in the
$L^2(0, T; L^2(\Omega))$ and $L^2(0, T; H^1(\Omega))$ norms (resp. $e_{L^2}$ and $e_{H^1}$) are reported
when using distorted meshes ($h_1 - h_2$ denotes the mesh size in horizontal and
vertical directions). Also, the effectivity indices (the ratio between the error
estimator and the true error) corresponding to the Zienkiewicz-Zhu
error estimator (17) and our simplified error indicator (21) are shown.
Clearly, when the mesh is refined in the horizontal direction, assumption (15)
holds with $s = 1$ since the $L^2(0, T; L^2(\Omega))$ error converges at order two and
the $L^2(0, T; H^1(\Omega))$ error at rate one. However, when the mesh is refined in
the wrong (vertical) direction, then the error does not decrease and (15) does
not hold.




Table 1. Test case with exact solution : parameters used for the computations. 



A mi r Cs Cl Ds Di fik 
10“^ -260 0.1 0.015 0.0238 5 10“^° 5 10~® 0.0015 



5 An adaptive algorithm generating meshes with high 
aspect ratio 

Table 2. Various convergence results for the travelling wave solution.

h1 - h2                    e_{L^2}         e_{H^1}   ei (Z-Z estimator (17))   ei (indicator (21))

Anisotropic meshes refined in both horizontal and vertical directions
0.000005 - 0.0001          1.1 10“®        0.29      1.01                      1.85
0.0000025 - 0.00005        3.2 10“’’       0.13      1.01                      1.74
0.00000125 - 0.000025      00 bo o 1 00    0.066     1.01                      1.74

Anisotropic meshes refined in horizontal direction only
0.000005 - 0.0001          1.1 1Q-®        0.29      1.01                      1.85
0.0000025 - 0.0001         4.4 10“’’       0.14      1.00                      1.79
0.00000125 - 0.0001        1.2 10-*        0.061     1.00                      1.79

Anisotropic meshes refined in vertical direction only
0.00005 - 0.00001          4.0 10“®        4.3       0.85                      1.67
0.00005 - 0.000005         3.2 10“®        5.3       0.86                      1.87
0.00005 - 0.0000025        4.2 10-®        8.48      0.86                      1.54

We now present the adaptive algorithm of [6]. Given a time step $\tau$, the goal
is to build triangulations $T_h^n$, $n = 1, ..., N$, having high aspect ratio such that
the relative estimated error in the $L^2(0, T; H^1(\Omega))$ norm is close to a preset
tolerance TOL. Our adaptive algorithm aims at building triangulations $T_h^n$,
$n = 1, ..., N$, such that

$0.75\ \mathrm{TOL} \le \left( \frac{\sum_{n=1}^N \sum_{K \in T_h^n} \left( \eta_{n,K}(c_{h\tau}) \right)^2}{\int_0^T \int_\Omega |\nabla c_{h\tau}|^2} \right)^{1/2} \le 1.25\ \mathrm{TOL},$   (22)

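where $\eta_{n,K}(c_{h\tau})$ is defined by (21). A sufficient condition to satisfy (22) is to
build, for each $n = 1, ..., N$, an anisotropic triangulation $T_h^n$ such that

$\frac{0.75^2\,\mathrm{TOL}^2}{N_V^n} \int_{t^{n-1}}^{t^n} \int_\Omega |\nabla c_{h\tau}|^2 \le \left( \eta_{n,K}(c_{h\tau}) \right)^2 \le \frac{1.25^2\,\mathrm{TOL}^2}{N_V^n} \int_{t^{n-1}}^{t^n} \int_\Omega |\nabla c_{h\tau}|^2$

for all triangles $K \in T_h^n$, where $N_V^n$ is the number of vertices of the mesh $T_h^n$.

We then proceed as in [15, 16] to build such an anisotropic mesh, using the
BL2D mesh generator [4].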
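
The tolerance window used by the adaptive algorithm can be sketched as follows. This is only an illustration of the local test on the squared indicators; it does not reproduce the anisotropic mesh generation of [15, 16] or the BL2D generator, and the function names and argument layout are hypothetical.

```python
import numpy as np

def mark_elements(eta2, grad_norm2, tol, n_vertices):
    # eta2: squared indicators eta_{n,K}^2 on the triangles of one time slab
    # grad_norm2: integral of |grad c|^2 over the slab and the domain
    # Returns masks of triangles whose indicator is too large or too small
    # with respect to the local window derived from (22).
    eta2 = np.asarray(eta2, dtype=float)
    lower = 0.75 ** 2 * tol ** 2 * grad_norm2 / n_vertices
    upper = 1.25 ** 2 * tol ** 2 * grad_norm2 / n_vertices
    too_large = eta2 > upper      # candidates for refinement
    too_small = eta2 < lower      # candidates for coarsening
    return too_large, too_small

def relative_estimated_error(eta2_all, grad_norm2_total):
    # Global quantity compared with 0.75 TOL and 1.25 TOL in (22)
    return float(np.sqrt(np.sum(eta2_all) / grad_norm2_total))

too_large, too_small = mark_elements([1e-6, 1e-4, 1e-2], 1.0, 0.1, 10)
rel_err = relative_estimated_error([0.0025, 0.0075], 1.0)
```

In an actual anisotropic algorithm the outcome of this test would drive a change of the local sizes $\lambda_{1,K}$, $\lambda_{2,K}$ and directions rather than an isotropic refinement.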



5.1 Computations with small anisotropy 

We now consider the following physical situation. At initial time, the computational
domain is liquid, with homogeneous concentration 0.02. Then, a circular
solid seed of diameter 2.5 10~^ and concentration 0.015 is placed at the center
of $\Omega$. The physical parameters are now given in Tab. 3 and are taken from
[12], table 1, column B, except $c_s$ and $c_l$.



Table 3. Parameters used for the computations. 

A mi T ~Cs Cl Ws A ~flk I 

0.5 1Q~^ -260 5 IQ-^ 0.015 0.0238 5 1Q~^^ 5 1Q~^ 0.Q015| 



We first present computations in the case when the anisotropy parameter
$\bar a$ is small. We set the number of dendrite arms $k = 4$ and choose $\bar a = 0.04$ so
that $\bar a < \frac{1}{k^2 - 1} \approx 0.0667$. The time step is $\tau = 5 \cdot 10^{-4}$ and the final time is
$T = 1$, making the total number of time steps 2000.

In Fig. 5, the adapted meshes, concentration and phase fields corresponding
to an adaptive computation with tolerance TOL = 0.0625 (6.25% estimated
relative error) are reported. The concentration $c$ and phase $\phi$ appear to be
smooth, but exhibit strong gradients across the solid to liquid transition zone,
therefore the mesh is strongly refined in the neighbourhood of the solid-liquid
interface. Zooms of the results are shown in Fig. 6. The adaptive algorithm
generates 400 meshes from initial to final time. The computation takes about
4 hours on a Pentium III 1.2 GHz PC, with a required memory of less than
300 MB. The maximum aspect ratio of the generated meshes is approximately
30, without any a priori upper bound imposed by the adaptive method.

5.2 Computations with large anisotropy 

We now choose the anisotropy parameter $\bar a > \frac{1}{k^2 - 1} \approx 0.0667$, namely
$\bar a = 0.1$. In this case there are no known existence results for the system
in $L^2(0, T; H^1(\Omega))$.

In Fig. 7, the concentration fields corresponding to an adaptive computation
with tolerance TOL = 0.0625 are reported. A zoom of the results at final
time, Fig. 8, shows that the gradient is discontinuous in some regions. This
phenomenon is explained in detail in [6].



6 Conclusions and perspectives 

A phase-field model corresponding to the isothermal, dendritic growth of a binary
alloy is considered. Existence is proved when the physical anisotropy is
small. A finite element method is proposed and a priori error estimates are
obtained, again when the physical anisotropy is small. A posteriori error estimates
are derived for meshes with large aspect ratio and a numerical study of
the effectivity index is proposed. Finally, an adaptive algorithm that generates
successive meshes with high aspect ratio is presented.

Numerical results corresponding to dendritic growth are then presented.
When the physical anisotropy is small, the numerical solution is smooth
whereas, when the physical anisotropy exceeds the predicted value, irregular
dendritic shapes are obtained. We are looking forward to obtaining
similar theoretical results for the multiphase-field model described in [18, 7].

Fig. 5. Computations with small anisotropy, $\bar a = 0.04$. Adapted meshes (left column),
concentration isovalues (middle column) and phase isovalues (right column),
from t = 0 to t = 1 s, with TOL = 0.0625 (6.25% estimated relative error). Row 1:
t = 0.05 s, 6874 vertices. Row 2: t = 0.5 s, 17170 vertices. Row 3: t = 1 s, 24441
vertices.

Fig. 6. Computations with small anisotropy, $\bar a = 0.04$. Zooms of the mesh at final
time.

Fig. 7. Computations with large anisotropy, $\bar a = 0.1$. Concentration isovalues from
t = 0 to t = 1 s, with TOL = 0.0625 (6.25% estimated relative error). Left:
t = 0.05 s, 5138 vertices. Middle: t = 0.5 s, 20580 vertices. Right: t = 1 s, 29971
vertices.

Fig. 8. Computations with large anisotropy, $\bar a = 0.1$. Zooms of the mesh at final
time.

Summary. We devise a family of discontinuous Galerkin methods for the Timoshenko
beam problem. Sufficient conditions for the existence and uniqueness of the
approximation are given. The method allows arbitrary meshes and arbitrary polynomial
degrees within the mesh, and hence is suitable for hp adaptivity. Numerical
results showing optimal and exponential convergence are provided. These features of
the method render it appealing for other problems in structural mechanics, such as
plates and shells.



1 Introduction 



In this paper, we introduce and numerically study discontinuous Galerkin (DG)
methods for Timoshenko beams. The Timoshenko beam model, see [1] and [2], can be
written as

$\frac{dw}{dx} = \theta(x) - \frac{T(x)}{(GA)(x)}, \quad \frac{d\theta}{dx} = \frac{M(x)}{(EI)(x)}, \quad \frac{dM}{dx} = T(x), \quad \frac{dT}{dx} = q(x),$   (1)

where $x \in \Omega = (0, L)$. Here, the unknowns are the transverse displacement
$w$, the rotation of the transverse cross-section of the beam $\theta$, the bending
moment $M$, and the shear force $T$. The material and geometrical properties of
the beam are characterized by the shear modulus $G$, the cross-section area $A$,
the Young modulus $E$, and the moment of inertia $I$. The transverse load, $q$, is
part of the data of the problem. To complete the model and ensure the existence
and uniqueness of its solution, we must impose suitable boundary conditions;
we take, for example,

$w(0) = w_0, \quad M(0) = M_0, \quad w(L) = w_L, \quad M(L) = M_L.$   (2)

Our long-term goal is to investigate the possible advantages of DG methods
in computational structural mechanics. In this paper, we begin our efforts by
studying how to properly devise DG methods for the Timoshenko beam. In
a forthcoming paper, we give a complete error analysis of these methods and
show that they can easily overcome the so-called shear locking.

In [4] Arnold analyzed the continuous version of the Galerkin method. He
proved error estimates which degenerate as the aspect ratio of the beam
tends to zero, and hence the method suffers from shear locking. In the same
paper he proved that locking is overcome if we use the so-called reduced integration
technique. These findings are verified by numerical experiments. For
the relationship between mixed finite element methods and reduced integration
techniques we refer to [5]. In [6], Li analyzed the p and hp versions of
the same method and proved error estimates independent of the aspect ratio
of the beam. This is consistent with the well known fact that locking can
be overcome by increasing the polynomial degree of the approximations. Our
preliminary results indicate that the DG methods overcome locking even if we
approximate all the unknowns with piecewise constants and do not use reduced
integration. This and other features render DG appealing for other problems
in structural mechanics, such as plates, shells and elasticity. Of course, there
are additional issues involved in these problems: Reissner-Mindlin plates have
boundary layers, elasticity problems have volumetric locking and shells exhibit
membrane locking. For a recent DG method for the Reissner-Mindlin plate see
[7]. Arnold and Falk provided a theoretical analysis of the boundary layers in
[8], and a uniformly accurate finite element method in [9]. For a locking-free
finite element method for shells see [10].

The paper is organized as follows. In section 2, we introduce the weak 
formulation we are going to use to define the DG methods. Then, in section 
3, we introduce their general form. In section 4, we introduce what we call 
a discrete energy identity which we use, in section 5, to establish conditions 
that ensure existence and uniqueness of the approximate solution. In section 
6, we present numerical results showing that the method can achieve optimal
h-convergence as well as exponential convergence. We end in section 7 with
some concluding remarks.



2 The weak formulation for the continuous case 

To display the weak formulation we use to define the DG methods, we need
to introduce some notation. Let $\mathcal{T} = \{I_j = (x_{j-1}, x_j),\ j = 1, ..., N\}$ be a
triangulation of the computational domain $\Omega = (0, L)$; we assume that the
nodes $x_i$ are such that $0 = x_0 < x_1 < \cdots < x_{N-1} < x_N = L$. Then, we write

$(\varphi, v) = \sum_{j=1}^N \int_{I_j} \varphi(x)\, v(x)\,dx, \qquad \langle R, [[v\,n]] \rangle = \sum_{j=0}^N R(x_j)\, [[v\,n]](x_j).$

Here, $R$ is a function defined on the set of nodes $\mathcal{E}_h := \{x_0, ..., x_N\}$.
The jump of the function $\varphi$, $[[\varphi\,n]]$, is defined as follows. If the node $e$ is in
$\{x_1, x_2, ..., x_{N-1}\}$, then we take $[[\varphi\,n]](e) = \varphi(e^+)\,n^+ + \varphi(e^-)\,n^-$, where
$\varphi(e^\pm) := \lim_{\epsilon \downarrow 0} \varphi(e \pm \epsilon)$ and $n^\pm = \mp 1$. For the boundary nodes, we take
$[[\varphi\,n]](0) = -\varphi(0^+)$, $[[\varphi\,n]](L) = \varphi(L^-)$. These jumps are well defined for $\varphi$ in
$H^1(\Omega_h)$, where $\Omega_h = \cup_{j=1}^N I_j$.
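
The role of the jumps in the element-by-element integration by parts can be checked on a small example. The sketch below is illustrative and not part of the paper: it takes a continuous polynomial $w$ and a discontinuous piecewise polynomial $v$ on a hypothetical three-element mesh, and verifies that $\sum_j \int_{I_j} w'\,v = -\sum_j \int_{I_j} w\,v' + \sum_e w(e)\,[[v\,n]](e)$, with $n^\pm = \mp 1$.

```python
from numpy.polynomial import Polynomial

def integrate(p, a, b):
    # Exact integral of the polynomial p over (a, b)
    P = p.integ()
    return P(b) - P(a)

def jump_n(nodes, pieces, e):
    # [[v n]](e) = v(e^-) - v(e^+) at interior nodes,
    # -v(0^+) at the left end and v(L^-) at the right end.
    n_el = len(pieces)
    minus = pieces[e - 1](nodes[e]) if e > 0 else 0.0
    plus = pieces[e](nodes[e]) if e < n_el else 0.0
    return minus - plus

nodes = [0.0, 0.3, 0.7, 1.0]                 # illustrative mesh of (0, 1)
w = Polynomial([1.0, -2.0, 3.0])             # continuous function 1 - 2x + 3x^2
v = [Polynomial([1.0, 1.0]),                 # one polynomial per element,
     Polynomial([-2.0, 0.0, 1.0]),           # discontinuous at the nodes
     Polynomial([0.5])]

lhs = sum(integrate(w.deriv() * v[j], nodes[j], nodes[j + 1]) for j in range(3))
volume = sum(integrate(w * v[j].deriv(), nodes[j], nodes[j + 1]) for j in range(3))
nodal = sum(w(nodes[e]) * jump_n(nodes, v, e) for e in range(4))
rhs = -volume + nodal
```

The two sides agree to rounding error; this is the identity that turns each equation of the first-order system into its weak form.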

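It is now easy to see that if we assume that $(T, M, \theta, w) \in [H^1(\Omega)]^4$, we
have

$-(w, \frac{d}{dx} v^1) + \langle w, [[v^1 n]] \rangle = (\theta, v^1) - (\frac{1}{GA} T, v^1),$   (3a)

$-(\theta, \frac{d}{dx} v^2) + \langle \theta, [[v^2 n]] \rangle = (\frac{1}{EI} M, v^2),$   (3b)

$-(M, \frac{d}{dx} v^3) + \langle M, [[v^3 n]] \rangle = (T, v^3),$   (3c)

$-(T, \frac{d}{dx} v^4) + \langle T, [[v^4 n]] \rangle = (q, v^4),$   (3d)

for all $(v^1, v^2, v^3, v^4) \in [H^1(\Omega_h)]^4$. This is the weak formulation we use to define
the DG methods.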



3 The DG Methods 

The approximate solution $(T_h, M_h, \theta_h, w_h)$ given by the DG method will be
sought in the finite dimensional space $V_h^{k_1} \times V_h^{k_2} \times V_h^{k_3} \times V_h^{k_4}$; here,

$V_h^k = \{ v : \Omega_h \to \mathbb{R} :\ v|_{I_j} \in P^k(I_j),\ j = 1, ..., N \},$

where $P^k(K)$ is the set of all polynomials on $K$ of degree not exceeding $k$. The
approximate solution is determined by requiring that

$-(w_h, \frac{d}{dx} v^1) + \langle \hat w_h, [[v^1 n]] \rangle = (\theta_h, v^1) - (\frac{1}{GA} T_h, v^1),$   (4a)

$-(\theta_h, \frac{d}{dx} v^2) + \langle \hat\theta_h, [[v^2 n]] \rangle = (\frac{1}{EI} M_h, v^2),$   (4b)

$-(M_h, \frac{d}{dx} v^3) + \langle \hat M_h, [[v^3 n]] \rangle = (T_h, v^3),$   (4c)

$-(T_h, \frac{d}{dx} v^4) + \langle \hat T_h, [[v^4 n]] \rangle = (q, v^4),$   (4d)

hold for all $v^i \in V_h^{k_i}$, for $i = 1, 2, 3, 4$. To complete the definition of the method,
we have to define the numerical traces $(\hat T_h, \hat M_h, \hat\theta_h, \hat w_h)$ at the nodes. It is
through them that the interaction between the degrees of freedom of different
intervals is introduced and the boundary conditions are actually imposed.
Moreover, their choice is crucial as it affects both the stability and the accuracy
of the method; see [3] for a detailed discussion of this issue for some other
problems.

Extending to our framework what has already been done for fluid flow
problems, we assume that the form of these traces is as follows. For an interior
node $e \in \mathcal{E}_h$, we take

$\hat w_h = \{w_h\} + C_{11} [[w_h n]] + C_{12} [[\theta_h n]] + C_{13} [[M_h n]] + C_{14} [[T_h n]],$   (5a)

$\hat\theta_h = \{\theta_h\} + C_{21} [[w_h n]] + C_{22} [[\theta_h n]] + C_{23} [[M_h n]] + C_{24} [[T_h n]],$   (5b)

$\hat M_h = \{M_h\} + C_{31} [[w_h n]] + C_{32} [[\theta_h n]] + C_{33} [[M_h n]] + C_{34} [[T_h n]],$   (5c)

$\hat T_h = \{T_h\} + C_{41} [[w_h n]] + C_{42} [[\theta_h n]] + C_{43} [[M_h n]] + C_{44} [[T_h n]],$   (5d)

where $\{\varphi\}(e) = \frac{1}{2}(\varphi(e^+) + \varphi(e^-))$. At $x = 0$, we take

$\hat w_h(0) = w_0,$   (6a)

$\hat\theta_h(0) = \theta_h(0^+) + C_{21}(0)(w_0 - w_h(0^+)) + C_{23}(0)(M_0 - M_h(0^+)),$   (6b)

$\hat M_h(0) = M_0,$   (6c)

$\hat T_h(0) = T_h(0^+) + C_{41}(0)(w_0 - w_h(0^+)) + C_{43}(0)(M_0 - M_h(0^+)),$   (6d)

and at $x = L$,

$\hat w_h(L) = w_L,$   (7a)

$\hat\theta_h(L) = \theta_h(L^-) + C_{21}(L)(w_h(L^-) - w_L) + C_{23}(L)(M_h(L^-) - M_L),$   (7b)

$\hat M_h(L) = M_L,$   (7c)

$\hat T_h(L) = T_h(L^-) + C_{41}(L)(w_h(L^-) - w_L) + C_{43}(L)(M_h(L^-) - M_L).$   (7d)


Note how the boundary conditions are incorporated into the DG method 
through the definition of the numerical traces at the border. Note also that 
the parameters Cij defining the numerical traces can have different values at 
different nodes. In the next section, we investigate the role of these parame- 
ters. In particular, we show that out of these sixteen parameters, six can be 
expressed in terms of the remaining ten and that only four of them have an 
impact on the “energy” of the discretization. 



4 The Discrete Energy Identity 

To see this, we consider a classical energy argument. It is not difficult to see
that if we take $v^1 = T$, $v^2 = -M$, $v^3 = -\theta$, and $v^4 = w$ in the equations (3),
and add them, we obtain the energy identity

$(\frac{1}{EI} M, M) + (\frac{1}{GA} T, T) = (q, w) + bc(T, M, \theta, w),$

where

$bc(T, M, \theta, w) = M_L\,\theta(L^-) - M_0\,\theta(0^+) - w_L\,T(L^-) + w_0\,T(0^+).$

Since this identity captures an essential feature of the problem under consideration,
we would like to obtain a similar energy identity for the DG method.
Such an identity is obtained in the following result.
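
The continuous energy identity can be verified on a closed-form solution. The sketch below is illustrative: it uses $\Omega = (0, 1)$, the load $q(x) = \sin(\pi x)$, homogeneous data $w_0 = w_L = M_0 = M_L = 0$, a square cross-section of side 0.01 and $G = 0.35\,E$ as in test problem 1 of Section 6, while the value of the Young modulus is a hypothetical choice made for this example. With homogeneous data the boundary term $bc$ vanishes, so the identity reduces to $(M/EI, M) + (T/GA, T) = (q, w)$.

```python
import numpy as np

E = 1.0e6                 # hypothetical Young modulus, chosen for this sketch
G = 0.35 * E              # shear modulus
a = b = 0.01              # width and height of the rectangular cross-section
GA = G * a * b            # shear stiffness
EI = E * a * b ** 3 / 12  # bending stiffness

def exact_solution(x):
    # Solution of dw/dx = theta - T/GA, dtheta/dx = M/EI, dM/dx = T, dT/dx = q
    # with q = sin(pi x), w(0) = w(1) = 0 and M(0) = M(1) = 0.
    T = -np.cos(np.pi * x) / np.pi
    M = -np.sin(np.pi * x) / np.pi ** 2
    theta = np.cos(np.pi * x) / (np.pi ** 3 * EI)
    w = np.sin(np.pi * x) * (1.0 / (np.pi ** 4 * EI) + 1.0 / (np.pi ** 2 * GA))
    return T, M, theta, w

# Gauss-Legendre quadrature mapped from (-1, 1) to (0, 1)
t, wt = np.polynomial.legendre.leggauss(20)
x, wq = 0.5 * (t + 1.0), 0.5 * wt

T, M, theta, w = exact_solution(x)
q = np.sin(np.pi * x)
lhs = np.sum(wq * (M ** 2 / EI + T ** 2 / GA))   # strain energy terms
rhs = np.sum(wq * q * w)                          # work of the load
```

Both sides equal $\frac12 \left( \frac{1}{\pi^4 EI} + \frac{1}{\pi^2 GA} \right)$, which gives a convenient reference value when testing a discretization.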

Proposition 1 (Discrete Energy Identity). Assume that $(T_h, M_h, \theta_h, w_h)$
is a solution of the DG method given by (4), with numerical traces given by
(5), (6), and (7). Moreover, assume that for all nodes $e \in \mathcal{E}_h$ we have

$C_{21} = C_{43}, \quad -C_{22} = C_{33}, \quad C_{24} = C_{13}, \quad C_{31} = C_{42}, \quad C_{34} = C_{12}, \quad -C_{11} = C_{44}.$   (8)

Then, we have

$(\frac{1}{EI} M_h, M_h) + (\frac{1}{GA} T_h, T_h) + \Theta_{jumps} = (q, w_h) + bc(T_h, M_h, \theta_h, w_h) + \Theta_{bc}.$   (9)

Here, setting $C_{14} = C_{32} = 0$ at the boundary nodes, we have

$\Theta_{jumps} = \sum_{e \in \mathcal{E}_h} \left( C_{14} [[T_h n]]^2 - C_{23} [[M_h n]]^2 - C_{32} [[\theta_h n]]^2 + C_{41} [[w_h n]]^2 \right)(e),$

$\Theta_{bc} = w_0 [C_{41} w_h(0^+) - C_{21} M_h(0^+)] + M_0 [C_{43} w_h(0^+) - C_{23} M_h(0^+)]$
$\qquad + w_L [C_{41} w_h(L^-) - C_{21} M_h(L^-)] + M_L [C_{43} w_h(L^-) - C_{23} M_h(L^-)].$

Proof. The proof of the above result follows by mimicking what was done for
the continuous case, that is, by taking $v^1 = T_h$, $v^2 = -M_h$, $v^3 = -\theta_h$, and
$v^4 = w_h$ in the definition of the DG method (4), adding the resulting equations,
and carrying out some simple algebraic manipulations. □

It is now clear that if we take

$C_{14},\ -C_{23},\ -C_{32},\ C_{41} \ge 0,$   (10)

then each of the terms of $\Theta_{jumps}$ can be considered to be an energy associated
with the discontinuous nature of the discretization. Thus, the above condition
ensures that the appearance of the jumps in the DG approximation is accompanied
by an increase of the total energy of the system. Since this can also
be thought of as being a stabilizing effect, the above parameters are called
the stabilization parameters. None of the remaining parameters appear in the
expression for the energy of the approximation, as we can see in the above
result.



5 Existence and uniqueness of the DG approximation 

Our main theoretical result is the following.

Theorem 1 (Existence and uniqueness of the DG approximation).
Consider the DG method defined by the weak formulation (4) and the numerical
traces (5), (6) and (7). Assume that the constants $C_{ij}$ satisfy (8) and (10).
Then the method has a unique solution in the following cases:

Case 1: $C_{41} > 0$ on $\mathcal{E}_h$, $-C_{32} > 0$ on the interior nodes, $k_2 \ge k_3 - 1$, and $k_1 \ge k_4 - 1$.

Case 2: $C_{ij} = 0$ on $\mathcal{E}_h$ except $C_{11} = C_{22} = -C_{33} = -C_{44} = 1/2$,
$C_{41}(L) > 0$, $k_2 \ge k_3$ and $k_1 \ge k_4$.

Case 3: $k_2 \ge k_3 + 1$ and $k_1 \ge k_4 + 1$.

From the first case, we see that the stabilization parameters associated
to $\theta_h$ and $w_h$, namely, $C_{32}$ and $C_{41}$, respectively, have a stronger influence
on the existence and uniqueness of the solution of the method than the ones
associated to $T_h$ and $M_h$, namely, $C_{14}$ and $C_{23}$, respectively.

From the second and third cases, we see that when the stabilization effect
of a jump is turned off (by setting the corresponding stabilization parameter
equal to zero), the existence and uniqueness of the approximate solution can
still be guaranteed by a suitable definition of the other parameters and/or
by modifying the polynomial degree of the approximate solutions. Roughly
speaking, the more stabilization parameters are equal to zero, the smaller the
spaces for $\theta_h$ and $w_h$ have to be in relation to the spaces of $M_h$ and $T_h$,
respectively.

Proof. Due to the linearity of the problem, it is enough to show that the only
solution to (4) with $q = 0$, $w_0 = w_L = M_0 = M_L = 0$ is $w_h = \theta_h = M_h =
T_h = 0$. In this case, (9) takes the form

$(\frac{1}{EI} M_h, M_h) + (\frac{1}{GA} T_h, T_h) + \Theta_{jumps} = 0.$   (11)

By the assumptions on the parameters $C_{ij}$, this implies $T_h = 0$, $M_h = 0$. On
the other hand, taking $v^1 = 1$ in (4a), we get that $(\theta_h, 1) = 0$.

Case 1. In the first case, from the discrete energy identity (11) we see
that $[[\theta_h n]] = 0$ on the interior nodes and $[[w_h n]] = 0$ on $\mathcal{E}_h$. As a consequence, if $k_3 = 0$,
$\theta_h$ is a constant and since $(\theta_h, 1) = 0$, it is equal to zero. If $k_3 > 0$, a similar
conclusion can be reached as follows: We have, by (5b), that $\hat\theta_h = \theta_h$ and, by
(4b), that $(\frac{d}{dx}\theta_h, v^2) = 0$ for all $v^2 \in V_h^{k_2}$. Since $\theta_h \in V_h^{k_3}$ and $k_2 \ge k_3 - 1$, this
implies that $\theta_h$ is a constant and since $(\theta_h, 1) = 0$, that it is equal to zero. A
similar argument shows that $w_h$ is also zero if $k_1 \ge k_4 - 1$.

Case 2. In the second case, by the de^nition of the numerical trace of 
Oh, we have that 0h{0) — 0h{0^) and that 0h{e) — 0h{e~) for all the remain- 
ing nodes e. Taking with support {xq,xi) we get, by equation (4b), that 
- Ixo + ) - i^h £ 9 h = 0. Since 9 h G V;''" 

and k2 > ks this implies that = 0 on that interval. Now, if we take 
with support on {xi,X2), equation (4b) becomes (^2 ) (^2 ) • 

Taking 1, we get that 0 h{x 2 ) = 0. Also, since k2 > ks, we can take 
= Qh and obtain that = 0. Hence, = 0, and so = 0

in (xi,X2). By repeating this argument, we obtain that 0^ = 0. Similarly, it is 
easy to show that w^, = 0 outside the last interval (xjv-i, xjy). By the equa- 
tion (4a), we have ^ -^Wh = 0, and hence Wh is a constant on the last 

interval since Wh G and k\ > /C4. Finally, since C4i(L) > 0, by the discrete 

energy identity (11), we have that lwhnj{L) — 0 and so Wh{L~) = 0. This 
implies that Wh = 0. 

Case 3. Finally, consider the third case. Since /c 2 > 1, taking = x/L 
in (4b), we get that 0h{L) = (^/i,l) = 0. Then, taking = 1 on {xj,Xj^i) 
and = 0 for the rest of the domain, the equation (4b) yields 0h{xj) = 
6h{xjj^i). This implies that = 0 on all the nodes. By (4b), this implies that 
(^/i, -^v^) = 0 for all G ^ and since Oh G and k2 > + 1, this 

implies that 9h = 0. A similar argument shows that Wh = 0 if k\ > /C 4 + 1. 

This completes the proof. □ 



6 Numerical Results 

In this section, we display preliminary numerical experiments showing that
DG methods for the Timoshenko beam can be constructed so as to achieve
optimal rates of convergence as well as exponential convergence. We consider
two test problems. In both problems, same-degree polynomial approximations
are used for all the unknowns, that is, we take $k_1 = k_2 = k_3 = k_4 = k$.

Test problem 1. We solve (1)-(2) in $\Omega = (0, 1)$ with the boundary
conditions $w_0 = w_L = 0$ and $M_0 = M_L = 0$. We take $q(x) = \sin(\pi x)$ for all $x \in \Omega$.
The transverse cross-section of the beam at the point $x$, $A(x)$, is assumed
to be rectangular with width and height $a(x)$ and $b(x)$; the moment of inertia
is given by the formula $I(x) = a(x) b^3(x)/12$. We take $a(x) = b(x) = 0.01$. The
Young modulus is $E$ = 10^ whereas the shear modulus is $G = 0.35E$.

We pick the numerical traces by setting Cij{x) = 0 ior i ^ j except — 
C23(0) - C 4 i(L) - 1/h. And Cii(x) = C22{x) = -Cssix) = -C44{x) - 0.5. 
The existence and uniqueness of the approximate solution is guaranteed by 
the second case of Theorem 1 . 

Test problem 2. The purpose of this test problem is to show that the
method can easily handle other boundary conditions and discontinuities in the
material properties and the load. Thus, we take the boundary conditions
$w_0 = 0$, $\theta_0 = 0$, $M_L = 0$ and $T_L = 0$. We also take $q(x) = \sin(\pi x)$ if $0 < x < 1/2$ and
$q(x) = -16(x - 3/4)^2 + 1$ if $1/2 < x < 1$, and $a(x) = b(x) = 0.02$ if $0 < x < 1/2$
and $0.05$ if $1/2 < x < 1$.

We use the following numerical traces. To capture the boundary conditions,
we take $\hat w_h(0) = 0$, $\hat\theta_h(0) = 0$, $\hat M_h(L) = 0$, and $\hat T_h(L) = 0$. To
define the remaining numerical traces, we simply take $C_{ij} = 0$ except for
$C_{11} = C_{22} = -C_{33} = -C_{44} = 1/2$. The existence and uniqueness of the
approximate solution is not guaranteed by Theorem 1, but can be easily proven.

In Fig. 1, we see that in both test problems, exponential convergence is
achieved. In Figs. 2 and 3, the solid line represents the exact solution and
the symbols represent the numerical trace of the approximate solution at the nodes
of the mesh. We display the results obtained on a uniform mesh of 10 elements.
In Tables 1 and 2, we display convergence orders up to polynomial degree
$k = 5$. Mesh number $i$ means a uniform mesh of $2^i$ elements. We can see that
optimal orders of convergence are achieved for all the unknowns even in the
case of piecewise constant approximation.
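
The "order" columns of the convergence tables can be reproduced from the "error" columns. The following is a minimal sketch, assuming (as stated above) that mesh number $i$ has $2^i$ elements, so that the mesh size halves from one mesh to the next; the function name `observed_orders` is ours and purely illustrative. The sample errors are the $T$ errors for $k = 1$ on meshes 4, 5 and 6 of test problem 1.

```python
import math

def observed_orders(errors):
    """Observed convergence orders between consecutive meshes.

    Assumes the mesh size is halved from one entry to the next,
    so the order is log2(e_coarse / e_fine).
    """
    return [math.log2(e_coarse / e_fine)
            for e_coarse, e_fine in zip(errors, errors[1:])]

# Errors of T for k = 1 on meshes 4, 5, 6 (test problem 1).
errors_T_k1 = [5.3e-04, 1.3e-04, 3.3e-05]
orders = observed_orders(errors_T_k1)
print([round(p, 2) for p in orders])  # both values are close to k + 1 = 2
```

Because the tabulated errors are rounded to two digits, the orders recomputed this way agree with the tabulated ones only approximately.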





Fig. 1. Exponential convergence: Test problem 1 (left) and 2 (right) 




Fig. 2. Piecewise constant (top) and linear (bottom) approximations for test prob- 
lem 1 

Table 1. Convergence rates for same-degree approximation for test problem 1.

degree  mesh     ||e_T||_L2(Ω)     ||e_M||_L2(Ω)     ||e_θ||_L2(Ω)     ||e_w||_L2(Ω)
  k     number   error    order    error    order    error    order    error    order

  0       4      1.6e-02  0.99     9.3e-03  0.98     1.6e-01  1.00     9.5e-02  0.99
          5      8.0e-03  1.00     4.7e-03  1.00     7.9e-02  1.00     4.8e-02  1.00
          6      4.0e-03  1.00     2.3e-03  1.00     3.9e-02  1.00     2.4e-02  1.00

  1       4      5.3e-04  2.00     2.5e-04  2.70     6.4e-03  1.99     2.0e-03  2.01
          5      1.3e-04  2.00     4.6e-05  2.47     1.6e-03  1.99     5.1e-04  2.00
          6      3.3e-05  2.00     1.1e-05  2.11     4.0e-04  2.00     1.3e-04  2.00

  2       4      8.3e-06  3.00     2.7e-06  3.24     1.0e-04  2.99     3.2e-05  3.00
          5      1.0e-06  3.00     3.3e-07  3.04     1.3e-05  3.00     4.0e-06  3.00
          6      1.3e-07  3.00     4.1e-08  3.00     1.6e-06  3.00     5.0e-07  3.00

  3       4      1.0e-07  4.00     4.9e-08  4.92     1.2e-06  3.97     3.9e-07  4.00
          5      6.3e-09  4.00     2.2e-09  4.49     7.6e-08  3.99     2.4e-08  4.00
          6      3.9e-10  4.00     1.3e-10  4.11     4.8e-09  4.00     1.5e-09  4.00

  4       4      9.8e-10  5.00     3.2e-10  5.31     1.2e-08  5.00     3.8e-09  5.00
          5      3.1e-11  5.00     9.7e-12  5.04     3.7e-10  5.00     1.2e-10  5.00
          6      9.5e-13  5.00     3.0e-13  5.00     1.2e-11  5.00     3.7e-12  5.00

  5       3      5.1e-10  5.99     4.9e-10  6.83     6.1e-09  5.93     2.0e-09  6.01
          4      7.9e-12  6.00     3.9e-12  6.99     9.6e-11  5.98     3.1e-11  6.00
          5      1.2e-13  5.99     4.3e-14  6.50     1.5e-12  6.00     4.8e-13  6.00


Fig. 3. Piecewise constant (top) and linear (bottom) approximations for test prob- 
lem 2 

Table 2. Convergence rates for same-degree approximation for test problem 2.

degree  mesh     ||e_T||_L2(Ω)     ||e_M||_L2(Ω)     ||e_θ||_L2(Ω)     ||e_w||_L2(Ω)
  k     number   error    order    error    order    error    order    error    order

  0       2      9.7e-2   0.89     1.1e-1   1.04     5.7e-1   1.14     4.7e-1   1.32
          3      5.1e-2   0.94     5.6e-2   1.04     2.7e-1   1.09     2.1e-1   1.19
          4      2.6e-2   0.98     2.7e-2   1.03     1.3e-1   1.05     9.6e-2   1.10

  1       2      1.3e-2   1.25     2.7e-3   1.94     1.1e-2   1.99     4.2e-3   2.04
          3      3.4e-3   1.89     6.7e-4   1.99     2.8e-3   2.00     1.1e-3   1.98
          4      8.6e-4   1.98     1.7e-4   1.99     6.9e-4   2.00     2.7e-4   1.98

  2       2      1.8e-3   3.00     2.6e-4   2.61     2.8e-4   2.92     2.1e-4   2.95
          3      2.2e-4   3.00     3.4e-5   2.93     3.5e-5   2.97     2.7e-5   2.98
          4      2.8e-5   3.00     4.3e-6   2.99     4.5e-6   2.99     3.4e-6   2.99

  3       2      1.8e-5   4.01     2.7e-5   3.99     1.4e-5   3.96     4.5e-6   4.06
          3      1.1e-6   4.01     1.7e-6   4.00     8.6e-7   3.99     2.8e-7   4.02
          4      7.1e-8   4.00     1.1e-7   4.00     5.4e-8   4.00     1.7e-8   4.01

  4       2      6.9e-7   4.94     2.3e-7   5.05     5.3e-7   4.96     1.7e-7   4.89
          3      2.2e-8   4.98     7.1e-9   5.02     1.7e-8   4.99     5.3e-9   4.97
          4      6.9e-10  4.99     2.2e-10  5.01     5.3e-10  5.00     1.7e-10  4.99

  5       2      2.3e-8   6.01     7.1e-9   5.91     1.7e-8   5.97     5.7e-9   6.04
          3      3.6e-10  6.00     1.1e-10  5.98     2.7e-10  5.99     8.9e-11  6.01
          4      5.6e-12  6.00     1.8e-12  5.99     4.3e-12  6.00     1.4e-12  6.01


7 Conclusion 

We devised a family of DG methods for the Timoshenko beam problem and
provided sufficient conditions for the existence and uniqueness of its solution.
We then displayed numerical results showing that the method converges to
the exact solution with optimal order even in the presence of discontinuities in the
material properties and the load. We also provided numerical results indicating
exponential convergence. In a forthcoming paper, we provide a complete
error analysis of the methods and show that they are free from shear locking.

Numerical Algorithms for Solving
Elliptic-Parabolic Problems

Summary. This paper deals with numerical algorithms for solving elliptic-parabolic
problems. An example of such a problem is given by the Richards equation for modeling
saturated-unsaturated water flow in porous media. We consider a linear model
problem and investigate the convergence of two finite-volume schemes. The first
one uses the implicit approximation in the whole domain, and the second scheme is
constructed using the splitting method. Results of numerical experiments are also
given.



1 Introduction 

In $(x,t) \in \Omega_l \times (0,T]$ we solve a problem describing two-phase flow in
layered porous media (for a more detailed description we refer to the book
of Helmig [Hel97]):

sipi-^ = V . {Xi{si)KiV{pi{si) - xi)), 

^ pi{x,t) (x,t) e OQd X [0,T], 

V(p/ - xi) — 0, (x, t) G dQ\dQD X [0, T], 

^ Sl(x,0) = Sinit(x), 

where $s_l$ is the water saturation in the $l$-th layer, $p_l$ is the pressure, and
$\lambda_l(s_l)K_l$, $\phi_l$, $\rho_l$ are the permeability, the porosity of the porous medium,
and the density of the fluid, respectively.

The equation becomes elliptic in the region of saturation, where pi > 0 and 

Si * 

V-iXiist)KiV{pi-x,))=0, 

here the pressure pi is the primary unknown, and sf denotes the water content 
of a water-saturated medium. 

The actual determination of the discrete solution may require large
computational resources due to the strong nonlinearity of the problem, the
dominance of the convective process, and the discontinuity of the porous medium
properties. The formulated problem looks like a system of parabolic partial differential
equations, but its type can become either nonlinear hyperbolic or degenerate
parabolic, depending on the influence of capillary pressure (see, e.g., Helmig
[Hel97], Eymard et al. [EGH00]).

An additional difficulty arises because the parabolic problem in the
unsaturated region degenerates to an elliptic problem in the saturated region of
the porous medium. Fully implicit discrete finite volume and finite difference
schemes are usually used for the approximation of the flow equations in the
whole region (see the papers of Alt and Luckhaus [ALu83], Chen and Ewing
[ChE97], Eymard et al. [EGH00], Jäger and Kačur [JaK91]).

It is well known that the most effective numerical algorithms for solving
multidimensional parabolic problems are based on splitting methods. New time
splitting schemes for the time integration of the problems describing
two-phase flow in a porous medium are proposed by Čiegis et al. [CPZ00]. The
parallel version of this algorithm is considered in [CCZ99].

In Section 2 we formulate the model linear 3D elliptic-parabolic initial- 
boundary value problem and define its approximation in space by the finite- 
volume method. In Section 3 we formulate and investigate the fully implicit 
scheme for integration in time. The additive integration scheme, which is based 
on the Douglas algorithm, is presented and investigated in Section 4. Finally, 
the results of numerical experiments are presented in Section 5. 



2 Problem Formulation 

We consider a linear elliptic-parabolic initial-boundary value problem, which
is defined in the region $Q = [0, 1]^3 \times [0, T]$:

' w = ,1 ^ £7) - 

3 d / du \ 

$u(X,t) = 0, \quad (X,t) \in \partial[0,1]^3 \times [0,T],$

$u(X,0) = u_0(X), \quad X \in [0,1]^3.$

We assume that: 

Q = Qparl^Qelh Qell = [<^ 5 ^]^ X 

Let $Q_{h\tau} = Q_h \times Q_\tau$ be the discrete uniform mesh:

$Q_h = \{X_{ijk} = (x_{1i}, x_{2j}, x_{3k}) : x_{lm} = mh, \ 0 < m < M\},$

$Q_\tau = \{t^n : t^n = n\tau, \ 0 < n < N\}.$

We use the notation $U^n_{ijk} = U(X_{ijk}, t^n)$. Using the finite-difference method we
approximate a part of the differential operator (1) by the following discrete
operators

$A_j U^n = (k_j U^n_{\bar x_j})_{x_j} - q_j U^n, \quad \tilde A_j U^n = A_j U^n + f_j(X, t^n), \quad j = 1, 2, 3,$

here the discrete difference operators are defined by

$U_{x_j} = \dfrac{U(x_j + h) - U}{h}, \quad U_{\bar x_j} = \dfrac{U - U(x_j - h)}{h}.$

The selection of a time-stepping procedure is a nontrivial task, since the
stability and robustness of the algorithm on the one hand must be balanced
with the computational efficiency on the other hand. In the following sections
we investigate two different numerical integration schemes.



3 Fully Implicit Difference Scheme 



We approximate the differential problem by the following modification of the 
backward Euler scheme: 



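$$c(X)\,\frac{U^{n+1} - U^n}{\tau} = \sum_{j=1}^{3} \tilde A_j U^{n+1}, \qquad
c(X) = \begin{cases} 1 & \text{if } X \in Q_{par}, \\ 0 & \text{if } X \in Q_{ell}. \end{cases} \tag{2}$$

At each time level $t^n$ we get a system of linear equations:

$$\Big(\frac{c}{\tau} I - \sum_{j=1}^{3} A_j\Big) U^{n+1} = \frac{c}{\tau} U^n + \sum_{j=1}^{3} f_j^{n+1},$$

or

$$\mathcal{A} U^{n+1} = F^n. \tag{3}$$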

We note that even if the problem is parabolic in the whole region of its 
definition (i.e., the flow in the porous media is unsaturated) the 3D parabolic 
problem is approximated by the backward Euler scheme. 

The system of linear equations (3) is solved by some iterative method, 
e.g. the Conjugate Gradient method. In the case of the nonlinear two-phase 
flow problem the matrix of the obtained system is non-symmetric, thus some 
special methods such as GMRES should be used to solve a system of linearized 
equations. 
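
To make the structure of the linear system concrete, here is a minimal one-dimensional sketch of one fully implicit step. It is an illustration under stated assumptions, not the 3D code used for the experiments: we take $k_j = 1$, $q_j = 0$, homogeneous Dirichlet data, an elliptic subregion $[0.4, 0.7]$ and a constant source; the helper names `conjugate_gradient` and `build_system` are ours. In the elliptic subregion the coefficient $c$ vanishes, so the corresponding rows reduce to the pure discrete elliptic operator, and the matrix stays symmetric positive definite, which is what the Conjugate Gradient method requires.

```python
import numpy as np

def conjugate_gradient(A, b, x0, rtol=1e-10, max_iter=5000):
    """Plain CG for a symmetric positive definite matrix A; returns (x, iterations)."""
    x = x0.astype(float).copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    stop = (rtol * np.linalg.norm(b)) ** 2  # relative residual criterion
    for it in range(max_iter):
        if rs <= stop:
            return x, it
        Ap = A @ p
        step = rs / (p @ Ap)
        x += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

def build_system(u_old, c, tau, h, f):
    """Matrix and right-hand side of (c/tau I + L) U^{n+1} = c/tau U^n + f in 1D.

    L is the standard three-point discretization of -d^2/dx^2 (assumption: k = 1, q = 0).
    """
    n = u_old.size
    lap = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    return np.diag(c / tau) + lap, c / tau * u_old + f

n, tau = 49, 0.1
h = 1.0 / (n + 1)
x = h * np.arange(1, n + 1)                       # interior mesh nodes
c = np.where((x >= 0.4) & (x <= 0.7), 0.0, 1.0)   # c = 0 in the elliptic part
u = np.sin(np.pi * x)                             # illustrative initial condition
f = np.ones(n)                                    # illustrative source term
for _ in range(3):
    A, rhs = build_system(u, c, tau, h, f)
    u, iterations = conjugate_gradient(A, rhs, u)
    print(iterations, np.linalg.norm(A @ u - rhs))
```

Using the previous time level as the initial guess for CG is a natural choice here, since the solution changes little between consecutive time levels when $\tau$ is small.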

The convergence rate of iterative methods depends essentially on the stiff- 
ness of the matrix A and on the distribution of its eigenvalues. It is easy to 
prove that the following spectral estimates are valid: 
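
$$\Big(\frac{\mathrm{meas}(Q_{par})}{\tau} + \lambda_{A,\min}\Big) I \le \mathcal{A} \le
\Big(\frac{\mathrm{meas}(Q_{par})}{\tau} + \lambda_{A,\max}\Big) I,$$

here $\lambda_{A,\min}$, $\lambda_{A,\max}$ are the smallest and largest eigenvalues of the matrix $A$,
respectively, and $\mathrm{meas}(Q)$ denotes the volume of $Q$.

Now we can investigate the stiffness number of the matrix $\mathcal{A}$:

$$\kappa(\mathcal{A}) = \frac{\mathrm{meas}(Q_{par})/\tau + \lambda_{A,\max}}{\mathrm{meas}(Q_{par})/\tau + \lambda_{A,\min}}.$$

Taking into account that $\lambda_{A,\min} = O(1)$, $\lambda_{A,\max} = O(h^{-2})$, and assuming
that $\mathrm{meas}(Q_{par}) > 0$, we obtain the following asymptotic estimates:

$$\kappa(\mathcal{A}) = \begin{cases}
\dfrac{c}{\mathrm{meas}(Q_{par})\, h^{3/2}} & \text{if } \tau = O(\sqrt{h}), \\[2mm]
\dfrac{c}{\mathrm{meas}(Q_{par})\, h} & \text{if } \tau = O(h), \\[2mm]
\dfrac{c}{\mathrm{meas}(Q_{par})} & \text{if } \tau = O(h^2).
\end{cases} \tag{4}$$
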



Thus for sufficiently large time steps r the number of iterations of the CG 
method will be approximately the same as for solving the pure elliptic problem. 
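
The three regimes of the stiffness number can be checked numerically. The sketch below evaluates the quotient $(\mathrm{meas}(Q_{par})/\tau + \lambda_{A,\max}) / (\mathrm{meas}(Q_{par})/\tau + \lambda_{A,\min})$ directly. The concrete constants are assumptions made only for illustration: $\lambda_{A,\min} = 1$, $\lambda_{A,\max} = 12/h^2$ (the scale of the largest eigenvalue of the seven-point discrete Laplacian), and $\mathrm{meas}(Q_{par}) = 0.5$.

```python
def stiffness_number(meas_par, tau, lam_min, lam_max):
    """Quotient (meas/tau + lam_max) / (meas/tau + lam_min)."""
    shift = meas_par / tau
    return (shift + lam_max) / (shift + lam_min)

def kappa(h, tau, meas_par=0.5):
    # Illustrative constants (assumptions): lam_min = 1, lam_max = 12 / h^2.
    return stiffness_number(meas_par, tau, 1.0, 12.0 / h**2)

for h in (1e-2, 1e-3):
    print(h,
          kappa(h, tau=h**0.5),  # grows roughly like h^(-3/2)
          kappa(h, tau=h),       # grows roughly like h^(-1)
          kappa(h, tau=h**2))    # stays bounded
```

With $\tau = O(h^2)$ the stiffness number stays bounded as $h \to 0$, whereas for larger time steps it approaches the $O(h^{-2})$ behavior of the pure elliptic problem.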



4 Finite Difference Scheme 2 



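We approximate the differential problem by the following modification of the
stability-correction scheme:

$$\begin{aligned}
\frac{U_s^{n+1/3} - U_s^{n}}{\tau} &= \tilde A_1 U_s^{n+1/3} + \tilde A_2 U_s^{n} + \tilde A_3 U_s^{n}, \\
\frac{U_s^{n+2/3} - U_s^{n+1/3}}{\tau} &= A_2 U_s^{n+2/3} - A_2 U_s^{n}, \\
\frac{U_s^{n+1} - U_s^{n+2/3}}{\tau} &= A_3 U_s^{n+1} - A_3 U_s^{n}.
\end{aligned} \tag{5}$$

Here $s$ denotes the elliptic iteration number, and the initial condition at time
level $t^n$ is recalculated after each iteration as

$$U^n_{s+1} = \begin{cases} U^n & \text{if } X \in \Omega_{h,par}, \\ U^{n+1}_s & \text{if } X \in \Omega_{h,ell}, \end{cases}
\qquad U^n_0 = U^n \ \text{ for } (X, t^n) \in Q_{h\tau}.$$
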

The proposed algorithm coincides with the Douglas splitting method if the
problem (1) is parabolic in the whole region of definition. The stability analysis
of the Douglas method was carried out by Hundsdorfer [Hun96]. It is interesting to note
that the Douglas splitting method is unconditionally stable if all eigenvalues of
all operators $A_j$ have negative real parts and at most one operator has
complex-valued eigenvalues (see also [CPZ00]).



Stability analysis of SCS 



In this section we will investigate the stability of the proposed iterative
algorithm (5). Let $\lambda_1, \lambda_2, \lambda_3$ be eigenvalues of the discrete operators $A_1, A_2, A_3$,
respectively:

$\lambda_j < 0, \quad j = 1, 2, 3.$

Let us denote the error of the iterative solution by 



^n+l 



f 0, if 2: G ^h,par ^ 

if XG Oh, ell. 



Theorem 1. The stability-correction scheme is unconditionally stable for the
three-dimensional linear elliptic-parabolic problem and the following
convergence estimate is valid:

$$\|Z^{n+1}_{s+1}\| \le q\, \|Z^{n+1}_{s}\|, \qquad q < 1.$$

Proof Let us consider the Fourier series: 

i,j,m 

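where $\rho_{i,j,m}$ are the stability functions, or the growth factors,

$$\rho_{i,j,m} = \frac{1 + \tau^2(\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3) - \tau^3 \lambda_1\lambda_2\lambda_3}{(1 - \tau\lambda_1)(1 - \tau\lambda_2)(1 - \tau\lambda_3)}.$$

Thus for $\lambda_j < 0$ the inequalities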



^ Ql ^ f 



hold unconditionally. Recall, that we have the following initial condition: 



{ 



0, 



if X G Oh, par ^ 

if X G Oh, ell. 



Hence \\Z^\\ can be bounded as follows 



m\<q2\\Z^^l 



92 < 1 , 



which implies that 

< 91 pni < 9192 II . 

This completes the proof. 
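
The unconditional bound on the growth factor used in the proof can be checked numerically. The sketch below evaluates the quotient $(1 + \tau^2(\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3) - \tau^3\lambda_1\lambda_2\lambda_3) / ((1-\tau\lambda_1)(1-\tau\lambda_2)(1-\tau\lambda_3))$ for randomly sampled negative real eigenvalues and time steps; the function name and the sampling ranges are our own illustrative choices. For negative $\lambda_j$ every term of the numerator is positive, and the denominator exceeds the numerator by $-\tau(\lambda_1 + \lambda_2 + \lambda_3) > 0$, so the factor lies strictly between 0 and 1 for any $\tau > 0$.

```python
import random

def douglas_growth_factor(tau, lam1, lam2, lam3):
    """Growth factor of the stability-correction (Douglas) scheme for one Fourier mode."""
    z1, z2, z3 = tau * lam1, tau * lam2, tau * lam3
    numerator = 1.0 + (z1 * z2 + z1 * z3 + z2 * z3) - z1 * z2 * z3
    denominator = (1.0 - z1) * (1.0 - z2) * (1.0 - z3)
    return numerator / denominator

rng = random.Random(0)
worst = 0.0
for _ in range(10000):
    tau = 10.0 ** rng.uniform(-3, 1)
    lams = [-(10.0 ** rng.uniform(-3, 3)) for _ in range(3)]
    worst = max(worst, douglas_growth_factor(tau, *lams))
print(worst)  # largest sampled factor; it stays below 1
```

The factor tends to 1 both for very small and for very large $\tau|\lambda_j|$, which is why the contraction constant depends on the spectrum of the discrete operators.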

5 Numerical Experiments

In this section we present some results of numerical simulations for problem 
(1) with the following discrete operators 

= (C” )*, + n. i = 1, 2, 3 . 

Two cases of elliptic regions are investigated:

$Q^1_{ell} = [0.4, 0.7] \times [0.4, 0.7] \times [0.4, 0.7],$

$Q^2_{ell} = [0.4, 0.5] \times [0.4, 0.5] \times [0.4, 0.5].$

The function $f(X,t)$ is defined using an exact solution of the differential
problem

$u(X, t) = \exp(t) \sin(\pi x_1) \sin(\pi x_2) \sin(\pi x_3).$

A stopping condition of the iterative algorithm is defined as: 

j = l 

In Table 1 we present the averaged numbers of iterations at one time level,
obtained by the CG method applied for the realization of the fully implicit
finite-difference scheme (2) with $T = 0.5$.



Table 1. The averaged numbers of iterations for the CG method

            N = 20                N = 40                N = 80
τ           Q^1_ell   Q^2_ell     Q^1_ell   Q^2_ell     Q^1_ell   Q^2_ell

0.1         21.8      7.8         48.8      20.8        102.8     53.8
0.05        19.8      7.0         44.1      18.9        96.1      49.9
0.025       17.6      6.15        38.3      16.9        83.5      43.7
0.0125      14.95     5.02        32.32     14.65       68.45     37.0


In Table 2 we present the averaged numbers of iterations at one time level,
obtained by the splitting scheme (5). Here $\tau_0$ denotes the value of the parameter
$\tau$ used in the elliptic region $Q_{ell}$.

Table 2. The averaged number of iterations for one time step

τ           iterations   τ_0         iterations   τ_0

0.1         15.8         0.005       26.2         0.002
0.05        14.35        0.006       26.5         0.002
0.025       12.85        0.007       24.3         0.0025



6 Conclusions

In this paper we have discussed two numerical algorithms for solving a
three-dimensional elliptic-parabolic problem. The main difference between these
methods is that in the first one the differential problem is treated as an elliptic
problem in the whole region of definition and is integrated in time by
the backward Euler scheme, whereas the second method treats the problem as
parabolic and the integration is done by a splitting-type method. The
advantage of the latter approach is that the linear algebra algorithm is reduced
to simple one-dimensional subproblems. The advantage of the first method is
that the fully implicit approximation leads to a very robust algorithm.

Summary. We provide an example of a stochastic approach to relaxation of
variational integrals with non-attainable infima in one dimension. We provide an
approximation for the coefficients of the Laplace transformation of the Probability
Density Function. This approximation yields the relaxing microstructures.



1 Variational Formulation of Non-attainable Differential 
Inclusions 

We have reported in [6] the application of the Subgrid Projection Method to the
problem of finding an approximation to solutions of non-attainable
differential inclusions. In this contribution, we describe how this approach leads to
a stochastic variational formulation of this problem. Consequently, we can
approach solutions to such problems by stochastic gradient flows. We recall that
we consider the following

Problem 1.1 Let $f \in W^{1,\infty}(0,1)$, $|f'(x)| < 1$ for a.a. $x \in [0,1]$. Find a
function $u \in W^{1,\infty}(0,1)$ such that

$u'(x) \in \{\pm 1\}$, for almost all $x \in (0, 1)$, and
$u(x) = f(x)$, for all $x \in (0, 1)$.

□ 

This problem cannot be solved in 1), much less in but if we 

relax any of the two contradictory requirements in (1.1) a bit, the set of the 
solutions is enormous. In fact it is dense in the sense of Baire category, [8]. 
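Namely, for any $\varepsilon > 0$ there exists $u_\varepsilon \in W^{1,\infty}(0,1)$ such that

$u'_\varepsilon \in \{\pm 1\}$, a.e. in $(0, 1)$, and

$\|u_\varepsilon - f\|_{L^\infty(0,1)} < \varepsilon.$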

Moreover, for any continuous function h = h{x) such that h{u'^) — ^ p, weakly 
in 1/^(0, 1), as e ^ 0+, we have 
h{y)dfi:,^u'Sy)^



(2) 



where $\mu_{x,u'_\varepsilon} = \lambda(x)\delta_{-1} + (1 - \lambda(x))\delta_{+1}$; $\delta_{\pm 1}$ denotes the Dirac measures on
$\mathbb{R}$ giving unit mass to the points $\pm 1$; and $\lambda(x) = \frac{1}{2}(1 - f'(x))$ a.e. in $[0, 1]$.
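
As a quick consistency check on this choice of the weight (a worked step added here for illustration, not taken from the cited works), the mean value of the two-point measure reproduces the macroscopic slope $f'$:

```latex
\int_{\mathbb{R}} y \, d\mu_{x}(y)
  = (-1)\,\lambda(x) + (+1)\,\bigl(1 - \lambda(x)\bigr)
  = 1 - 2\lambda(x)
  = 1 - \bigl(1 - f'(x)\bigr)
  = f'(x).
```

In other words, $\lambda(x)$ is the local volume fraction of the slope $-1$ needed for an oscillating sequence with slopes $\pm 1$ to follow $f$ on average.
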
A functional with non-attainable infimum compatible with Problem 1.1 is the
popular potential




$I(u) = \int_0^1 |u(x) - f(x)|^2 \, dx.$ (3)



Of course, the choice of this particular form is somewhat arbitrary. It is obvious 
that 



inf /(Mh)>0 ( 4 ) 

for any conforming approximation Vh of Hence it follows from (1) 

that the set of local minimizers of (4) is large. In particular it is shown in [4] 
that if Eh = inf \4 I{uh) then there exists a family Kh consisting of (l//i)^/^ 
local minimizers of the same discretized problem such that 

I{vh) < (1 + 24Vh)Eh, for any Vh G Kh, 
sup I{tvl + (1 - t)vl) > ^Eh, v\,vl e Kh- 

tG[0,l] 

Moreover, we have shown in [7] that if we consider a problem similar to Prob- 
lem 1.1, corresponding to tetragonal-to-cubic phase transformations, and if we 
apply any Descent Algorithm with a pseudo-gradient as descent navigation 
then such an algorithm converges strongly regardless of the initial guess (that 
is a weakly differentiable function) even if the infimum cannot be attained! 
The situation is even more complicated by the fact that this class of problems
suffers from the so-called Lavrentiev phenomenon, which makes the minimizers
dependent on the choice of the functional space [3]. In summary, classical
minimization algorithms applied to (3)-like potentials with non-attainable
infima fail: there exists a huge number of local minimizers on the discrete level,
computationally the minimizers depend on the polynomial approximation, and
the large energy barriers make the solution dependent on the initial guess.
We conclude that, in the light of these difficulties, it is hopeless to expect
a reasonable outcome by including into the variational formulation just the
information about the averaged states and the crystallographic constraints for
the derivatives. We have shown in [5] and [7] that the variational formulation
should have the form

$I = W_{\mathrm{macro}} + W_{\mathrm{micro}} + W_{\mathrm{stochastic}}.$ (6)

Our conclusion is motivated by the observation that we can achieve very good
numerical results by building minimizing sequences which become
asymptotically (weak) white noise in their derivatives, cf. [5], [7]. Note that such
sequences cannot be periodic! Hence, we conclude that the appropriate
microscale dynamical system for approaching the solutions of (1.1) ought to be
given by a Langevin dynamics, i.e. by a stochastic gradient flow. Assuming
that the Helmholtz free energy has the form (6), we obtain the Langevin
system as an Euler-Lagrange equation corresponding to this functional. We
present such an approach in this paper. We refer to [5], [6] and [7] for two
possible constructions of $W_{\mathrm{stochastic}}$.



2 Two Microscale Finite Dimensional Langevin Models 

We consider two models: one with a fixed potential and one with a time varying 
potential. Our objective is to find appropriate local minima of the potential 

$I(u) = W_{\mathrm{macro}}(u) + \lambda W_{\mathrm{micro}}(u),$ (7)

where

$W_{\mathrm{macro}}(u) = \int_0^1 |f(x) - u(x)|^2 \, dx,$ (8)

$W_{\mathrm{micro}}(u) = \int_0^1 |u'(x)^2 - 1|^2 \, dx.$ (9)

As $\lambda \to \infty$, the minimizer of $I$ will converge to the solution of the constrained
minimization problem

minimize $W_{\mathrm{macro}}(u)$ subject to $|u'| = 1$.

We will generally consider piecewise affine approximate solutions to this
minimization problem. Thus

$u'(x) = \sum_{i=1}^{n} v_i \chi_{A_i}(x),$ (10)

where $v = (v_1, \ldots, v_n)$,

$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A, \end{cases}$

is the characteristic function of $A$ and $A_i = [(i - 1)/n, i/n)$. For a given vector $v$ of values of $u'$ on the
intervals $A_i$, we will recover $u$ via the formula



$u(x) = \int_0^x \sum_{i=1}^{n} v_i \chi_{A_i}(y) \, dy + \int_0^1 \Big[ f(y) - \int_0^y \sum_{i=1}^{n} v_i \chi_{A_i}(z) \, dz \Big] dy.$ (11)

With this convention, we may write $u = Hv$ where $H$ is the linear operator
implicit in (11) and think of $I$ as a function of $v$ rather than $u$, say

$I(v) = I(Hv).$

2.1 Time Independent Energy Density

Our first approach is to consider a Stochastic Differential Equation (SDE) for
$v$ of the Langevin form

$dv(t) = -\nabla I(v) \, dt + d\eta(t),$ (12)

$v(0) = 0,$

where $\eta$ is an $n$-dimensional Wiener process satisfying

$E[\eta(t)] = 0, \quad E[\eta(t) \otimes \eta(s)] = \Lambda \min\{s, t\}.$

The matrix $\Lambda$ is a given $n \times n$ positive definite matrix, usually taken as an
identity matrix. Note that $d\eta(t)$ is a white noise process. In other words, we
assume that the energy corresponding to Problem 1.1 has the form (6), and
(12) is the underlying stochastic gradient flow enforcing the competition in the
weak topology to provide the bridge between the atomic nano-scale and the
specimen's microscale (~ μm).

The choice of the initial condition is somewhat arbitrary. We will have more 
to say on this point in our second approach. This SDE describes a particle 
moving under a force (defined by the gradient of the potential) subject to 
“thermal” agitation (defined by the white noise term). The resulting stochastic 
process is a diffusion. 

For large values of $\lambda$, the diffusion process defined by (12)
wanders for a short period of time until it “falls” into one of the “wells”
corresponding to the local minima of $W_{\mathrm{micro}}$. Our Monte Carlo experiments
indicate that there is a slight preference for the particle to be initially attracted
to the deeper wells (which correspond to smaller local minima of the potential because the
corresponding $u$ is a better approximation to $f$ in that it makes the $W_{\mathrm{macro}}$
term smaller), but this preference is very weak. For this reason, a second
approach was considered, which we believe better reflects the physical process
of the material phase changes we are modeling.

2.2 Time Dependent Energy Density 

Consider a time dependent potential

$I(u,t) = [1 - \alpha(t)] \, W_{\mathrm{macro}}(u) + \alpha(t) \, \lambda \, W_{\mathrm{micro}}(u).$ (13)

The SDE remains of the same Langevin form. Namely,

$dv(t) = -\nabla I(v,t) \, dt + d\eta(t).$ (14)

Here, the gradient is taken only with respect to the first variable $v$. The strategy
for specifying $\alpha$ is to set

$\alpha(t) = 0, \quad 0 < t < t_1,$ (15)

where $t_1$ is chosen large enough that $v(t)$ has converged to a steady state
solution of the Langevin equation similar to (13) but containing only $W_{\mathrm{macro}}$.
Then, $\alpha(t)$ is increased linearly to its maximum value of 1

$\alpha(t) = \min\{a(t - t_1), \alpha_{\max}\},$ (16)

until the particle drops into one of the wells of $W_{\mathrm{micro}}$.

The actual implementation requires selection of a time interval $\Delta t$ and then
we approximate the SDE with the Stochastic Difference Equation (SΔE):

$v((k + 1)\Delta t) - v(k\Delta t) = -\nabla I(v(k\Delta t), k\Delta t) \, \Delta t + Z_{k+1} \, \Delta t^{1/2},$ (17)

where $Z_1, Z_2, \ldots$ are independent and identically distributed Gaussian random
vectors having mean 0 and covariance $\Lambda$. For given $W_{\mathrm{macro}}$ and $W_{\mathrm{micro}}$ the
simulation requires the following inputs:

1. $\lambda$, which controls the weight given to the $W_{\mathrm{micro}}$ term in the potential.

2. $\Lambda$, which is the covariance matrix for the noise; we have taken this to be
a multiple of the identity, $\Lambda = \sigma^2 I$, so only specification of $\sigma$ is necessary.

3. $t_1$, which is the “burn in time” so that the process $v$ is in a steady state
determined by $W_{\mathrm{macro}}$. This requires some experimentation depending on
$f$.

4. $a$, which controls how quickly the potential transforms from $W_{\mathrm{macro}}$ to
$W_{\mathrm{micro}}$. This was controlled by choosing the maximum number of time
steps and choosing $a$ so that $\alpha_{\max}$ was achieved at the maximum number
of time steps.

5. $\alpha_{\max}$, the maximum change in $\alpha$.

6. $\Delta t$, the time step.

The actual numerical values we used are given in Table 1. 
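
The time stepping (17) with the schedule (15)-(16) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the experiments: the noise covariance is $\sigma^2 I$; $W_{\mathrm{micro}}$ is evaluated for piecewise constant $u' = v$ on $n$ equal intervals, which gives $W_{\mathrm{micro}}(v) = \frac{1}{n}\sum_i (v_i^2 - 1)^2$; and the gradient of $W_{\mathrm{macro}}$ is passed in as a callable, for which the demo uses a purely hypothetical quadratic stand-in instead of the operator $H$. All function names are ours.

```python
import numpy as np

def alpha_schedule(t, t1, a, alpha_max):
    """alpha(t) = 0 during burn-in (t <= t1), then a linear ramp capped at alpha_max."""
    if t <= t1:
        return 0.0
    return min(a * (t - t1), alpha_max)

def grad_w_micro(v):
    """Gradient of W_micro(v) = (1/n) * sum((v_i^2 - 1)^2) for piecewise constant u' = v."""
    return 4.0 * v * (v**2 - 1.0) / v.size

def simulate(grad_w_macro, n, dt, n_steps, t1, a, alpha_max, lam, sigma, seed=0):
    """Explicit stochastic difference scheme for dv = -grad I(v, t) dt + d(eta)."""
    rng = np.random.default_rng(seed)
    v = np.zeros(n)  # initial condition v(0) = 0
    for k in range(n_steps):
        al = alpha_schedule(k * dt, t1, a, alpha_max)
        grad = (1.0 - al) * grad_w_macro(v) + al * lam * grad_w_micro(v)
        z = sigma * rng.standard_normal(n)  # Z_{k+1} ~ N(0, sigma^2 I)
        v = v - grad * dt + z * np.sqrt(dt)
    return v

# Hypothetical stand-in for the macro gradient: pulls every slope towards 0.3.
toy_grad_macro = lambda v: v - 0.3

v_final = simulate(toy_grad_macro, n=4, dt=0.01, n_steps=3000,
                   t1=5.0, a=0.1, alpha_max=1.0, lam=10.0, sigma=0.02)
print(v_final)  # slopes fluctuate around +1 for this toy macro term
```

With $\sigma = 0$ the scheme reduces to explicit gradient descent, which is convenient for checking the schedule: the slopes first settle at the macro steady state and are then driven into the nearest well of $W_{\mathrm{micro}}$ as $\alpha$ grows.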

3 Meso-scale Fokker-Planck Equation 

The Probability Density Function (PDF) g : M't x 1 — > R can be used 

to obtain any statistical information contained in the microscale Langevin 
systems for v at a fixed time point t. Namely, 

E[h(t,v)] = J h{t,v)g{t^v) dv, for any h G (R“^, R^) . (18) 

The PDF g is obtained by solving the meso-scale deterministic Fokker-Planck 
equation, [10]. The Fokker-Planck equation for the Langevin system in Section 
2.1 has the form 

= -divt, [DW density {v)g{t,v)\ + yAv5(i,v), V eR^,t>0, (19) 
where $\sigma$ is the white noise standard deviation. Hence, for our specific form of
the energy density given by (7), the Fokker-Planck Equation (19) becomes



dg{t,v) 

dt 



j f{x)l3{x) dx) • Vg{t, - (y /^(^) ® *i^) ^ v) 



+ ^ E ~ ^ g{t, v) + y A 5 (i, v), 

( 20 ) 



where 



f{x) = f{x) - [ f{y) dy, 

Jo 

= / Pi{x)Pi{x) da;, 

Jo 

_ n-»+i/2 ^ a;G (0, (i-l)/n), 

x-{i- l)/n - a; G [(i - l)/n, i/n), 



Bij 



Pi{x) 



1/n ■ 



_ i-l/2 



, X G (z/^5 !)• 



Since the microscale process given by the Langevin equation is a diffusion, we 
expect, in the sense of the convergence in measure, 

2n 2^ 

lim lim lim y(t, v) = Vyi<5v*(v) := y(v), Vq'i = 1, y* > 0, 

cr2-^0-|_ A— >+oo t— ^+oo ^ ^ ' 

^ i=\ i=l 

( 21 ) 



where $v^*_i$ represent all $2^n$ states with $v^*_{ij} \in \{\pm 1\}$. Note that

1. $t \to +\infty$ corresponds to finding the equilibrium distribution of the states,
which is given by the Gibbs distribution [9],

2. $\lambda \to +\infty$ represents imposing the $\pm 1$ constraint,

3. $\sigma^2 \to 0^+$ corresponds to cooling,

4. $n \to \infty$ gives the continuum case.

The Laplace transform of g{v) has the form 
[Lg){y) = 

1 = 1 



(22) 
where $v^*_j$ denotes one of the $2^n$ possible distributions of $\pm 1$, having the
probability $q_j$. Our goal now is to compute the coefficients $q_j$ in (22). In principle,
we have two options:

1. Take the Laplace transform of the Fokker-Planck equation (20) with the
aim to obtain a dense linear $2^n \times 2^n$ system for the unknown coordinates
$q_j$. The system can be obtained by inserting $2^n$ various vectors $y$ into
(20) together with the representation (22). One may then use a reduction
technique [2], [1] to obtain a sparse system and solve it.

2. Enumerate all the $2^n$ states, determine their probabilities $q_j$, and pick
the state with the highest probability as the most likely one to which the
system will relax.

Here, we show only the second approach, which is based on the following
conjecture.

Conjecture 3.1 

qi — g'S — g)^“ 2 where, 

^■=1 (23) 

1 A _ / (j7») - / (O' - 1)/») 

2 \ 1/n 

This conjecture has the important application that it can be used to compute 
directly the volume fraction. We believe that it is probably not exact, but in 
fact provides an excellent approximation. It is in part based on our previous 
work [5] where we found that the computed approximate solutions of (1.1) had 
a white noise property. Since 

f\x) = [ ydy:,{y), (24) 

Jr 

where /j^x = a{x)S-i + (1 — a{x))S^i, a{x) = ^ (1 — • Then if we let 

dj - 2^ Qi, (25) 

v*,=-i 

and if a„ were defined by a„ € C''’(0, 1) such that a„{xj) = aj, then 

an —* a, as n — > oo in ^^(0, 1). (26) 



4 Monte Carlo Simulations Based on the Langevin 
Equation 

We use the model (17) and we apply Monte Carlo simulations to obtain se- 
quences which come close to approximating Problem 1.1. The target function 
for this simulation is $f(x) = x(1 - x)$ with $\int_0^1 f(x) \, dx = 1/6$. The volume
fraction is computed by averaging over the replications. Namely,

= 2 

Here, $m$ is the number of replications and $v_i$ is the vector of the derivatives for
the $i$-th replication. The index $j$ refers to the interval $A_j$, i.e., $[(j - 1)/n, j/n)$,
where $n$ is the number of intervals in $(0, 1)$. The result shown in this section
is based on twenty independent simulations for the mesh with $h = 1/200$.
To deduce the macroscopic shape, we average over all twenty replications, cf.
Figure 1.



Table 1. Data for the model (17) used in this section

Variable                               Value
spatial resolution                     h = 1/200
number of independent replications     20
max no. of steps per replication       10^
no. of steps for burn in               10^
time increment per step                0.05
standard deviation for white noise     0.02
maximum change in alpha                0.5








Fig. 1. Macroscopic approximations. The left picture shows all twenty replications
(each replication has a different color) and the averaged state (thick smoother line).
The right picture shows the difference between the target function $f = x(1 - x)$ and
its computed shape. The spatial resolution is $h = 1/200$.

Fig. 2. Volume fractions. The volume fraction on the left is computed with $\sigma = 0.02$.
The volume fraction on the right is computed with the data given in Table 1 but
with a much smaller deviation, namely $\sigma = 0.005$. This shows that particles need
higher values of the deviation to discover states with lower energy.



5 Simulations Based on the Analysis of the 
Fokker-Planck Equation 

In this section we use the formulae and convergence results to investigate the
approximation properties based on the formulae (23) and (25). We chose the
spatial resolution to be $h = 1/16$. With 16 elements on the (0,1) segment
we have $2^{16} = 65536$ coefficients in the formula (23), and each state vector
has 16 components. We evaluate the values of $q_i$ and we select the state for
which $q_i$ is the biggest. In other words, we select the state with the highest
probability to exhibit the microscopic structure of the solution. The volume
fraction is computed using the formula (25). It seems that the discrete solution
corresponding to the maximum probability state minimizes the distance 
to the target function. We chose f{x) — ^ sin(13a:) for the calculations in this 
section. The results are plotted in Figure 3. 
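
The enumeration strategy described in this section can be sketched in a few lines of Python. This is only an illustration under a loudly stated assumption: the slopes on different elements are treated as independent, so the probability of a state in $\{-1, +1\}^n$ is taken to be a product of per-element factors $a_j$ (slope $-1$) and $1 - a_j$ (slope $+1$), with $a_j = \frac{1}{2}(1 - \text{slope}_j)$ computed from difference quotients of the target function. The function names are hypothetical and the product form is not claimed to be the exact statement of formula (23).

```python
import itertools


def volume_fractions(f, n):
    """a_j = (1 - slope_j) / 2, slope_j being the difference quotient of f on [(j-1)/n, j/n]."""
    h = 1.0 / n
    return [0.5 * (1.0 - (f(j * h) - f((j - 1) * h)) / h) for j in range(1, n + 1)]


def enumerate_states(a):
    """Yield (state, probability) for every state in {-1, +1}^n.

    ASSUMPTION: slopes on different elements are independent, so the
    probability is the product of a_j (slope -1) or 1 - a_j (slope +1).
    """
    for state in itertools.product((-1, 1), repeat=len(a)):
        q = 1.0
        for s, aj in zip(state, a):
            q *= aj if s == -1 else 1.0 - aj
        yield state, q


def most_likely_state(a):
    """Return the (state, probability) pair with the highest probability."""
    return max(enumerate_states(a), key=lambda pair: pair[1])


def marginals(a):
    """For each element j, sum the probabilities of all states whose j-th slope is -1."""
    m = [0.0] * len(a)
    for state, q in enumerate_states(a):
        for j, s in enumerate(state):
            if s == -1:
                m[j] += q
    return m
```

With `n = 16` the loop visits all 65536 states, which is still cheap; the cost doubles with every additional element, which is why this brute-force approach is only practical on coarse meshes.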
Summary. Today the finite element method is known as a powerful tool capable
of solving complex flows in complex geometries. Additionally, the unstructured
grid topology is a complementary tool which effectively increases computational
efficiency. On the other hand, finite element volume methods have the
advantage of conserving the conserved quantities within elements. However,
accurate conservation statements require suitable approximations at cell faces.
In convection-dominated flows, upwind-based schemes are widely used. However,
these schemes do not incorporate the details of the pressure field in the
approximation. Therefore, the pressure-weighted upwind scheme is a better choice
for a flow field with high pressure gradients. In this work, a pressure-weighted
upwind scheme is suitably extended for solving incompressible flow on unstructured
grids. Subsequently, a remedy is given for the problem associated with using
equal-order pressure and velocity interpolations. Finally, the extended formulations are
validated against suitable benchmark problems involving small- and large-scale
recirculation zones. Compared with the benchmark solutions, the current results are
excellent.



1 Introduction 

The rapid progress in unstructured grid generation techniques has encouraged
CFD code developers to extend their formulations in order to solve fluid
flow and heat transfer problems on unstructured grids [1]. In fact,
computational methods are always required to improve their accuracy in solving
more complex, realistic configurations at lower computational cost.
Limited computer memory forces computational methods
to use a limited number of nodes, which can in turn degrade the achieved
accuracy. An unstructured grid can be used as a powerful tool to compensate
for the degraded accuracy by suitable grid clustering within the zones with high
flow field gradients.

Basically, there are two potential numerical instabilities associated with
Galerkin-based methods [2]. The first instability is spurious oscillation in the
flow field due to the presence of advection terms. The second instability is
due to using inappropriate combinations of velocity and pressure interpolations.
Generally speaking, the standard Galerkin-based formulations need to
be modified when convection is the dominant physics. The Petrov-Galerkin
[3], Taylor-Galerkin [4], Galerkin/least-squares [5], ... are a number of
effective remedies to resolve the problem.

Although the advantages of using unstructured grids in finite element volume
methods are many [6, 7], the instability issues still persist. In fact, the key point
is the correct approximation of the cell face velocities, which is still
challenging. Normally, upwind-biased interpolations do not take into account
the weight of the pressure field in flow acceleration/deceleration. Therefore, the
pressure-weighted upwinding scheme is used as an important alternative [6].
Unfortunately, this scheme cannot be taken as a solution to the consequences
of using equal-order interpolations [8]. In this work, a physically based
pressure-weighted upwind scheme is introduced and suitably extended for use in
appropriate combinations of velocity and pressure interpolations. In addition, a
robust prescription for accommodating equal-order interpolations is given.



2 Governing Equations 

In the present study, we are concerned with two-dimensional incompressible
steady flow. The governing equations consist of the conservation statements
for mass and momentum. The non-dimensional vector form of the governing
equations is given by

$$ \nabla^* \cdot \mathbf{V}^* = 0 \tag{1} $$

$$ Re\,[\nabla^* \cdot (\mathbf{V}^*\mathbf{V}^*) + \nabla^* p^*] = \nabla^{*2}\mathbf{V}^* \tag{2} $$

where the lengths $x$ and $y$, velocity $\mathbf{V}$, and pressure $p$ variables are
nondimensionalized with respect to a characteristic length $L_\infty$ (e.g., $x^* = x/L_\infty$, $y^* = y/L_\infty$),
a reference velocity $U_\infty$ (e.g., $\mathbf{V}^* = \mathbf{V}/U_\infty$), and a reference density $\rho_\infty$ (e.g.,
$p^* = p/(\rho_\infty U_\infty^2)$).




Fig. 1. A part of unstructured grid representing the elements and a constructed cell 

The solution domain is broken into a large number of triangular elements
which are distributed in an unstructured form. The elements fully cover the
solution domain with no overlapping. Figure 1 shows a small part of the
solution domain. Nodes are located at the triangle vertices. They are the locations
of the unknown variables. Each triangle has three main neighbors,
see element ABC in Fig. 1. There is no limit on the number of elements
that meet at a node. For example, the shaded area in Fig. 1 shows six triangles
which surround node P. Therefore, to utilize the benefits of cell-centered
schemes, each element is divided into three quadrilaterals with the help of its
three medians. The medians are shown by dashed lines in Fig. 1. The cells
are then constructed from the proper assemblage of these sub-quadrilaterals.
As is seen, irrespective of the shape and distribution of the elements, each
node is surrounded by a number of sub-quadrilaterals. The proper assemblage
of neighboring sub-quadrilaterals around any non-boundary node creates a
polygonal cell. On an unstructured grid, it is possible to have a hybrid mesh
composed of polygons having different numbers of sides, because the
number of elements which meet at a specific node is not fixed.
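
The cell construction described above can be illustrated with a short Python sketch. It relies on two elementary facts: the three medians split a triangle into three sub-quadrilaterals of equal area (one third of the triangle each), and every triangle meeting at an interior node contributes two cell faces (the two median segments joining the edge midpoints to the centroid). The function names and the hexagonal test patch are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def triangle_area(p, q, r):
    """Unsigned area of the triangle with vertices p, q, r, each an (x, y) pair."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def median_dual_cells(nodes, triangles):
    """Return (area, faces): cell area and number of interior cell faces per node.

    Each triangle gives one sub-quadrilateral (a third of its area) and
    two median segments (cell faces) to each of its three vertices.
    Faces lying on the domain boundary are not counted.
    """
    area = defaultdict(float)
    faces = defaultdict(int)
    for tri in triangles:
        a = triangle_area(nodes[tri[0]], nodes[tri[1]], nodes[tri[2]])
        for k in tri:
            area[k] += a / 3.0
            faces[k] += 2
    return dict(area), dict(faces)


def hexagon_patch():
    """Six unit-side triangles around a central node 0, similar to node P in Fig. 1."""
    nodes = [(0.0, 0.0)] + [
        (math.cos(k * math.pi / 3.0), math.sin(k * math.pi / 3.0)) for k in range(6)
    ]
    triangles = [(0, 1 + k, 1 + (k + 1) % 6) for k in range(6)]
    return nodes, triangles
```

For the hexagonal patch the central node is surrounded by six triangles, so the sketch reports twelve cell faces for it.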



3 Computational Modelling 

To utilize the advantages of finite element volume methods, the governing
equations are first integrated over the shaded area, or cell, shown in
Fig. 1. Applying the Gauss divergence theorem to the dimensional form
of the governing equations leads to

$$ \int_A \mathbf{V} \cdot d\mathbf{A} = 0 \tag{3} $$

$$ \int_A u(\rho\mathbf{V}) \cdot d\mathbf{A} = -\int_A p\,dA_x + \int_A (\mu\nabla u) \cdot d\mathbf{A} \tag{4} $$

$$ \int_A v(\rho\mathbf{V}) \cdot d\mathbf{A} = -\int_A p\,dA_y + \int_A (\mu\nabla v) \cdot d\mathbf{A} \tag{5} $$

where $\mathbf{V} = u\mathbf{i} + v\mathbf{j}$, $p$, $\rho$, and $\mu$ represent velocity, pressure, density, and the
molecular viscosity, respectively. The above integrals are evaluated over the
surface which encloses each cell. The cell area is indicated by $A$. The above
equations are suitably discretized using a finite difference scheme and finite
element interpolations. In the above expressions,
$d\mathbf{A} = dA_x\,\mathbf{i} + dA_y\,\mathbf{j} = \Delta y\,\mathbf{i} - \Delta x\,\mathbf{j}$
is calculated on each cell face. Using this definition, the above integrals can be
evaluated by summation over the faces that enclose the cell, i.e.,

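$$ \sum_{i=1}^{ns} [\rho(u\,dA_x + v\,dA_y)]_i = 0 \tag{6} $$

$$ \sum_{i=1}^{ns} [\rho(\bar{u}\,dA_x + \bar{v}\,dA_y)\,u]_i = -\sum_{i=1}^{ns} (p\,dA_x)_i + \sum_{i=1}^{ns} \left[\mu\left(\frac{\partial u}{\partial x}\,dA_x + \frac{\partial u}{\partial y}\,dA_y\right)\right]_i \tag{7} $$

$$ \sum_{i=1}^{ns} [\rho(\bar{u}\,dA_x + \bar{v}\,dA_y)\,v]_i = -\sum_{i=1}^{ns} (p\,dA_y)_i + \sum_{i=1}^{ns} \left[\mu\left(\frac{\partial v}{\partial x}\,dA_x + \frac{\partial v}{\partial y}\,dA_y\right)\right]_i \tag{8} $$

Fig. 2. The velocity upwinding strategy within an element
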

where $i$ counts the cell faces from 1 to $ns$. The number of cell faces
around node P in Fig. 1 is 12. The bar over $u$ and $v$ indicates that the variables
are approximated from the known magnitudes of the preceding iteration. These
estimations are necessary in order to linearize the nonlinear convection terms.
The rest of the procedure is to relate the cell face magnitudes (identified by
lower case letters such as $u$, $v$, and $p$) directly to the nodal magnitudes
(identified by upper case letters such as $U$, $V$, and $P$), where the
unknown variables are located. A simple idea for treating the right-hand-side
terms is the use of finite element shape functions, i.e.,



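$$ p_i = \sum_{j=1}^{3} N_j P_j \tag{9} $$

$$ \left.\frac{\partial \phi}{\partial z}\right|_i = \sum_{j=1}^{3} \frac{\partial N_j}{\partial z}\,\Phi_j \tag{10} $$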



where $p_i$ identifies the magnitude of $p$ at the mid-point of the $i$th cell face. The
index $j$ counts the nodes of the element in which the $i$th cell face is
located. Additionally, the variable $z$ represents either the $x$ or $y$ coordinate
and $\phi$ represents either the $u$ or $v$ velocity component. In the above expressions,
lower and upper case letters represent cell face and nodal magnitudes,
respectively.
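
As an illustration of how nodal values are carried to a cell face, the following Python sketch evaluates the standard linear shape functions $N_j$ of a triangle and their constant gradients, and uses them to interpolate a nodal field and its derivatives at an arbitrary point such as a face mid-point. The choice of linear (three-node) shape functions is an assumption consistent with the triangular elements used here; the function names are hypothetical.

```python
def shape_functions(tri, point):
    """Linear shape functions of a triangle and their gradients.

    tri is ((x1, y1), (x2, y2), (x3, y3)); point is (x, y).
    Returns (N, dNdx, dNdy), each a 3-tuple ordered like the nodes.
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = point
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice the signed area
    n1 = ((x2 - x) * (y3 - y) - (x3 - x) * (y2 - y)) / det
    n2 = ((x3 - x) * (y1 - y) - (x1 - x) * (y3 - y)) / det
    n3 = 1.0 - n1 - n2
    dndx = ((y2 - y3) / det, (y3 - y1) / det, (y1 - y2) / det)
    dndy = ((x3 - x2) / det, (x1 - x3) / det, (x2 - x1) / det)
    return (n1, n2, n3), dndx, dndy


def interpolate(nodal_values, tri, point):
    """Value at the point: sum over the three nodes of N_j times the nodal value."""
    n, _, _ = shape_functions(tri, point)
    return sum(nj * vj for nj, vj in zip(n, nodal_values))


def gradient(nodal_values, tri, point):
    """(d/dx, d/dy) at the point: sum of shape function gradients times nodal values."""
    _, dndx, dndy = shape_functions(tri, point)
    gx = sum(d * v for d, v in zip(dndx, nodal_values))
    gy = sum(d * v for d, v in zip(dndy, nodal_values))
    return gx, gy
```

Because the shape functions are linear, any linear field is reproduced exactly, which gives a simple sanity check for an implementation.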

The above treatments complete the pressure and diffusion term calculations at
cell face $i$. However, more sophisticated expressions are required to treat the
convection terms. In fact, the treatment should not disregard the
convection-diffusion physics. To mimic the correct physics of convection,
the convection term on the left-hand side is upwinded. Considering the $i$th cell
face in Fig. 2, one inclusive suggestion is given by



ASki ( 11 ) 

which has been written in the streamwise direction at the mid-point of the $i$th cell
face. The length $\Delta S$ is a geometry-sensitive parameter shown in Fig. 2 as $s$.
Then, we need to determine the gradient of $\phi$ along the streamline. We try
to approximate this gradient using the original governing PDEs. In other
words, one meaningful approximation can be obtained by writing the revised
momentum equations in the streamwise direction, i.e.,

= V • mV<A + 5^ (12) 

where $V = \sqrt{u^2 + v^2}$ is the total velocity at the cell mid-point and the source
term $S^\phi$ represents either $\partial p/\partial x$ in treating x-momentum or $\partial p/\partial y$ in treating
y-momentum. The substitution of Eq. (12) in Eq. (11) results in



+ 






ASki 



(13) 



As is observed, the influence of pressure has now been considered in calculating the
correction part of Eq. (11). Using the finite-element context, this statement
can be revised to



j=l ^ ' 



1 / Yl]=i - (pi 






3 

■E 

j=i 



dNj, 

dz 



Pj ASki (14) 



where $L_i$ is an appropriate diffusion length scale [6]. This length can be
estimated in a specified triangle by discretizing the diffusion terms using central
differencing.

Equation (14) shows that $\phi_i$ appears on both sides of the equation. As is seen,
considering a lagged role for $\phi$ in the diffusion term results in a passive role
of the diffusion term in the formulations. To switch it to an active role, it is not
lagged and the impact of this term is taken to the left-hand side of Eq. (14). A
suitable rearrangement of the new equation in terms of our major dependent
variables, i.e., $\Phi_j$ and $P_j$, yields

$$ \phi_i = \sum_{j=1}^{3} \alpha_{ij}\Phi_j + \sum_{j=1}^{3} \beta_{ij}P_j + \gamma_i \tag{15} $$

where $\alpha$, $\beta$, and $\gamma$ represent matrix, matrix, and vector coefficients,
respectively. The above statement indicates that $\phi$ $(= u, v)$ at a cell face can be
approximated by the proper assemblage of $\Phi$ and $P$ influences. In fact, this
approximation can be regarded as a pressure-weighted upwind scheme.
As is known, one major difficulty with the continuity equation is the lack
of any explicit pressure term, despite its role in determining the pressure field.
Past investigation has shown that ignoring this important point
can result in a non-physical wavy solution [9]. The most important reason for
this non-physical wavy solution is the use of equal-order pressure and
velocity interpolations in Eqs. (6)-(8). The use of unequal-order interpolations is
known as a general remedy to suppress the non-physical solution. However,
the current innovative idea suggests the use of more sophisticated interpolations
such as Eq. (15), which enforces the direct role of the pressure field in the continuity
equation. Therefore, using Eq. (15) in the continuity equation may eliminate the
need for unequal-order interpolations. In other words, Eq. (3) should no longer
permit the occurrence of a non-physical solution in the domain. Although this
strategy theoretically seems to work well, some deficiencies have been encountered
in practice. For example, defining $m = \rho(u\,dA_x + v\,dA_y)$, Eqs. (6)-(8) for
Euler flow can be re-written as



$$ \sum_{i=1}^{ns} m_i = 0 \tag{16} $$

$$ \sum_{i=1}^{ns} m_i u_i = -\sum_{i=1}^{ns} (p\,dA_x)_i \tag{17} $$

$$ \sum_{i=1}^{ns} m_i v_i = -\sum_{i=1}^{ns} (p\,dA_y)_i \tag{18} $$



In a one-dimensional context, Reference [9] shows that the above equations
can still result in an undesirable wavy solution under special circumstances.
However, the wavy solution can be eliminated if Eq. (14) is suitably modified.
Therefore, instead of using unequal-order interpolation functions, we introduce
and utilize new $\phi$'s at cell faces. Considering the generic form of upwinding
given by Eq. (11), the new velocity gradients are given by



= V • 



,,du dv. 



(19) 



where the new additional gradient terms in the parentheses provide new state- 
ments for the velocity components. Considering the new definitions, Eq.(13) 
is revised to 



+ 




V • /iV0 + S(f) — (f) 



^ du dv\ 
\aa: ''' dy) 



^Ski 



( 20 ) 



The new additional gradient terms are approximated using the approach
employed in Eq. (10). The rest of the equation is treated similarly to Eq. (14). The tilde
over $\phi$ indicates that these velocity statements are used in the continuity
equation.
4 Results and Discussion

The derived formulations are tested using the standard square cavity [10] and
triangular cavity [11] benchmark problems, which involve many complex
recirculation zones. The first problem is the flow in a square cavity driven by
its upper lid. The problem is tested at Re = 3200. Figure 3 (left) shows
a typical non-uniform unstructured grid distribution in the cavity. The grid
generator has been developed by the current authors. It is capable of generating
different types of unstructured grid topology. As is observed, the grid has
been properly refined in regions with high flow field gradients. Figure 3 (right)
depicts the streamlines in the cavity. The recirculation zones in the bottom corners
and the top-left corner reflect the complexity of this flow field. They have been
detected successfully and accurately.

To investigate the advantages of using the extended formulations on an
unstructured grid, the cavity is tested on both uniform and non-uniform
unstructured grids. The grid resolution is 71x71 for the uniform grid. The total
number of nodes in the non-uniform grid is 5508, which is close to that of the uniform
grid, i.e., 5684. The numbers of cells are 10734 and 11086, respectively.






Fig. 4. The centerline velocities in the cavity and the convergence histories 
Figure 4 (left) shows the centerline u and v velocities for both uniform
and non-uniform grid types. The current results are compared with each other
and with those of the benchmark [10]. The figure shows that the results on the
non-uniform grid are similar to those on the uniform grid. Additionally, they are
in good agreement with the benchmark solution. Figure 4 (right) shows the
residual histories for both the U velocity component and the P field. The V velocity
component history is very similar to that of U. The histories are presented for
both uniform and non-uniform grids. Despite the equal accuracy seen in
Fig. 4 (left), the convergence performances are different. As is seen, convergence
instabilities are dominant on the uniform grid. However, the non-uniform unstructured
grid exhibits a smooth and stable residual reduction.

The second test problem is the steady recirculating viscous flow in an
equilateral triangular cavity. A primary eddy and several secondary eddies in
different regions indicate the complexity of the flow field. Figure 5 shows the
geometry of the cavity with two typical unstructured grid distributions. The
top horizontal wall is sliding with a constant velocity. The non-uniform
topology suitably clusters the grid around the corners. Of course, the refinement
helps to achieve more accurate results with a limited number of mesh nodes.

Fig. 5. Using uniform and non-uniform unstructured grid topologies in the triangular
cavity

Fig. 6. The streamlines and isobars in the triangular cavity

Figure 6 shows the streamline patterns (left) and isobar lines using the
non-uniform grid distribution. The Reynolds number is 500. The total number of
nodes is 1135. As is observed, there is one primary eddy on top and two
secondary eddies under it. The eddies shrink at lower levels. Additionally,
there is one irregular recirculation on the left edge, which has been successfully
detected. Unfortunately, there is no quantitative report of velocity magnitudes
for this benchmark case. However, a qualitative comparison has been made
with other references such as Refs. [11, 12]. Table 1 shows that the x and
y coordinates of the eddy center locations are in good agreement with those
obtained from the benchmark solutions.



Table 1. A comparison of the (x, y) locations of vortices in the triangular cavity

                   top vortex      mid vortex       bottom vortex   edge vortex
Current results    0.038, 0.639    -0.047, 0.298    0.000, 0.064    -0.389, 0.706
Guermond [12]      0.042, 0.632    -0.049, 0.294    0.000, 0.062    -0.395, 0.701
Summary. We deal with the numerical solution of the compressible Navier-Stokes
equations with the aid of the discontinuous Galerkin finite element (DGFE)
approach with nonsymmetric interior penalty terms. The linearization of diffusive
terms and the treatment of the boundary conditions are discussed. Several numerical
examples demonstrating the efficiency of the numerical method are presented.



1 Introduction 

We deal with the numerical solution of the compressible Navier-Stokes equa- 
tions with the aid of the discontinuous Galerkin finite element method 
(DGFEM). This method has become quite popular and it is discussed in a num- 
ber of papers. For a review of DG methods, see [4] or [5]. Let us mention the 
papers [1] and [2] dealing with the numerical simulation of compressible flows, 
where the mixed formulation is applied to the treatment of the viscous terms. 

We develop the so-called DGFEM with nonsymmetric interior penalty 
terms. This method was applied to the solution of a scalar nonlinear 
convection-diffusion equation in [8] where a complete numerical analysis is 
presented. The extension of DGFEM to the system of the Navier-Stokes equa- 
tions is straightforward (see preliminary results in [6]) but some suitable lin- 
earization of diffusive terms has to be performed. That is the subject of this 
paper. 

In Section 2 the continuous problem describing compressible flow is formu- 
lated. DGFE discretization is introduced in Section 3, where also the lineariza- 
tion of diffusive terms is discussed. Several numerical examples demonstrating 
the efficiency of the method are presented in Section 4. 



2 Continuous problem 

Let $\Omega \subset \mathbb{R}^2$ be a bounded plane domain and $T > 0$. We set $Q_T = \Omega \times (0, T)$
and by $\partial\Omega$ we denote the boundary of $\Omega$, which consists of several disjoint
parts. We distinguish inlet $\Gamma_I$, outlet $\Gamma_O$ and impermeable walls $\Gamma_W$ on $\partial\Omega$.
We want to find a vector-valued function $w : Q_T \to \mathbb{R}^4$ such that

$$ \frac{\partial w}{\partial t} + \sum_{s=1}^{2} \frac{\partial f_s(w)}{\partial x_s} = \sum_{s=1}^{2} \frac{\partial R_s(w, \nabla w)}{\partial x_s} \quad \text{in } Q_T, \tag{1} $$



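where

$$ w = w(x,t), \quad x \in \Omega, \ t \in (0,T), $$

$$ w = (\rho, \rho v_1, \rho v_2, E)^{\mathrm T} \in \mathbb{R}^4, $$

$$ f_i(w) = (\rho v_i, \ \rho v_1 v_i + \delta_{1i}\,p, \ \rho v_2 v_i + \delta_{2i}\,p, \ (E + p)\,v_i)^{\mathrm T}, \quad i = 1, 2, \tag{2} $$

$$ R_i(w, \nabla w) = \Big(0, \ \tau_{i1}, \ \tau_{i2}, \ \tau_{i1}v_1 + \tau_{i2}v_2 + \frac{\gamma}{Re\,Pr}\frac{\partial\theta}{\partial x_i}\Big)^{\mathrm T}, \quad i = 1, 2. $$

To system (1) we add the thermodynamical relations

$$ p = (\gamma - 1)(E - \rho|v|^2/2), \qquad \theta = \frac{E}{\rho} - \frac{1}{2}|v|^2. \tag{3} $$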



We use the following notation: $v = (v_1, v_2)^{\mathrm T}$ - velocity vector, $\rho$ - density, $p$
- pressure, $\theta$ - temperature, $E$ - total energy, $\gamma$ - Poisson adiabatic constant,
$Re$ - Reynolds number, $Pr$ - Prandtl number.
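
For orientation, the standard perfect-gas relation $p = (\gamma - 1)(E - \rho|v|^2/2)$ can be used to recover the primitive quantities from a state vector $w = (\rho, \rho v_1, \rho v_2, E)$. The Python sketch below does exactly that. The function name and the default value $\gamma = 1.4$ (air) are assumptions made for the example, not data taken from the paper.

```python
def primitive_variables(w, gamma=1.4):
    """Return (rho, (v1, v2), p) for a conservative state w = (rho, rho*v1, rho*v2, E).

    ASSUMPTION: perfect gas with p = (gamma - 1) * (E - rho * |v|^2 / 2);
    gamma = 1.4 is a typical value for air, used here only as a default.
    """
    rho, m1, m2, energy = w
    v1, v2 = m1 / rho, m2 / rho
    kinetic = 0.5 * rho * (v1 * v1 + v2 * v2)
    pressure = (gamma - 1.0) * (energy - kinetic)
    return rho, (v1, v2), pressure
```

For a gas at rest with $\rho = 1$ and $E = 2.5$ this gives $p = 1$ up to rounding.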

System (1) is equipped with the initial condition

$$ w(x, 0) = w^0(x), \quad x \in \Omega, \tag{4} $$



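and the following set of boundary conditions on appropriate parts of the boundary:

$$ \text{a) } \rho|_{\Gamma_I \times (0,T)} = \rho_D, \qquad \text{b) } v|_{\Gamma_I \times (0,T)} = v_D = (v_{D1}, v_{D2})^{\mathrm T}, \tag{5} $$

$$ \text{c) } \sum_{j=1}^{2}\Big(\sum_{i=1}^{2} \tau_{ij} n_i\Big) v_j + \frac{\gamma}{Re\,Pr}\frac{\partial\theta}{\partial n} = 0 \quad \text{on } \Gamma_I \times (0,T); $$

$$ \text{a) } v|_{\Gamma_W \times (0,T)} = 0, \qquad \text{b) } \frac{\partial\theta}{\partial n}\Big|_{\Gamma_W \times (0,T)} = 0; \tag{6} $$

$$ \text{a) } \sum_{i=1}^{2} \tau_{ij} n_i = 0, \ j = 1, 2, \qquad \text{b) } \frac{\partial\theta}{\partial n} = 0 \quad \text{on } \Gamma_O \times (0,T). \tag{7} $$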

The problem to solve the compressible Navier-Stokes equations, equipped 
with the above initial and boundary conditions will be denoted by (CFP) 
(compressible flow problem). 




262 V. Dolejsi 

3 DGFE discretization 



3.1 Triangulation 

By $\Omega_h$ we denote a polygonal approximation of the domain $\Omega$. Let $T_h$ ($h > 0$)
denote a standard triangulation of the closure $\overline{\Omega}_h$ of the domain $\Omega_h$ into a finite
number of closed triangles.

We set $h_K = \mathrm{diam}(K)$, $h = \max_{K \in T_h} h_K$. All elements of $T_h$ will be
numbered so that $T_h = \{K_i\}_{i \in I}$, where $I \subset Z^+ = \{0, 1, 2, \ldots\}$ is a suitable
index set. If two elements $K_i, K_j \in T_h$ have a common edge, we call
them neighbours and put $\Gamma_{ij} = \partial K_i \cap \partial K_j$. For $i \in I$ we set
$s(i) = \{j \in I;\ K_j \text{ is a neighbour of } K_i\}$. The boundary $\partial\Omega_h$ is formed by a finite
number of edges of elements $K_i$ adjacent to $\partial\Omega_h$. We denote all these boundary
faces by $S_j$, where $j \in I_b \subset Z^- = \{-1, -2, \ldots\}$, and set
$\gamma(i) = \{j \in I_b;\ S_j \text{ is an edge of } K_i\}$, $\Gamma_{ij} = S_j$ for $K_i \in T_h$ such that $S_j \subset \partial K_i$, $j \in I_b$.
For $K_i$ not containing any boundary edge $S_j$ we set $\gamma(i) = \emptyset$. Obviously,
$s(i) \cap \gamma(i) = \emptyset$ for all $i \in I$. Now, if we write $S(i) = s(i) \cup \gamma(i)$, we have

$$ \partial K_i = \bigcup_{j \in S(i)} \Gamma_{ij}, \qquad \partial K_i \cap \partial\Omega_h = \bigcup_{j \in \gamma(i)} \Gamma_{ij}. \tag{8} $$

Moreover, for $i \in I$, by $\gamma_D(i)$ we denote the subset of $\gamma(i)$ formed by such
indexes $j$ that the faces $\Gamma_{ij}$ approximate the parts of $\partial\Omega$ where the Dirichlet
boundary condition is prescribed for at least one component of $w$. Then, for
the Navier-Stokes equations, with respect to (5)-(7) we have

$$ \bigcup_{i \in I} \bigcup_{j \in \gamma_D(i)} \Gamma_{ij} = \Gamma_I \cup \Gamma_W. \tag{9} $$

Moreover, we set

$$ \gamma_N(i) = \gamma(i) \setminus \gamma_D(i), \tag{10} $$

where a Neumann type of boundary condition is prescribed for all components
of $w$.
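
The index sets introduced above are straightforward to build from a list of triangles. The following Python sketch computes, for each element $K_i$, the set of neighbours $s(i)$ and the set of boundary edges playing the role of $\gamma(i)$. As a simplification that differs from the paper's convention, boundary edges are identified here by their vertex pairs rather than by negative integers; the function name is hypothetical.

```python
from collections import defaultdict


def edge_sets(triangles):
    """Return (s, gamma) for a conforming triangulation.

    triangles is a list of vertex-index triples. s[i] is the set of
    indices of elements sharing an edge with element i; gamma[i] is the
    set of boundary edges of element i, each stored as a sorted vertex
    pair (a simplification of the negative indexing used in the text).
    """
    owners = defaultdict(list)
    for i, tri in enumerate(triangles):
        for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            owners[frozenset((a, b))].append(i)

    s = {i: set() for i in range(len(triangles))}
    gamma = {i: set() for i in range(len(triangles))}
    for edge, elems in owners.items():
        if len(elems) == 2:  # interior edge shared by two elements
            i, j = elems
            s[i].add(j)
            s[j].add(i)
        else:  # edge owned by a single element lies on the boundary
            gamma[elems[0]].add(tuple(sorted(edge)))
    return s, gamma
```

In a conforming triangulation every element has exactly three edges, so the sizes of `s[i]` and `gamma[i]` always add up to three, which mirrors the decomposition $S(i) = s(i) \cup \gamma(i)$.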

Furthermore, we use the following notation: $n_{ij} = ((n_{ij})_1, (n_{ij})_2)$ = unit
outer normal to $\partial K_i$ on the edge $\Gamma_{ij}$ ($n_{ij}$ is a constant vector on $\Gamma_{ij}$) and
$|\Gamma_{ij}|$ = length of the edge $\Gamma_{ij}$. Over the triangulation $T_h$ we define the broken
Sobolev space

$$ H^k(\Omega, T_h) = \{v;\ v|_K \in H^k(K)\ \forall K \in T_h\}. \tag{11} $$

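For $v \in H^1(\Omega, T_h)$, with $v|_{\Gamma_{ij}}$ the trace of $v|_{K_i}$ on $\Gamma_{ij}$, we set

$$ \langle v \rangle_{\Gamma_{ij}} = \tfrac{1}{2}\big(v|_{\Gamma_{ij}} + v|_{\Gamma_{ji}}\big) \quad \text{and} \quad [v]_{\Gamma_{ij}} = v|_{\Gamma_{ij}} - v|_{\Gamma_{ji}}, \tag{12} $$

denoting the average and jump of the traces of $v$ on $\Gamma_{ij} = \Gamma_{ji}$, respectively.
Obviously, $\langle v \rangle_{\Gamma_{ij}} = \langle v \rangle_{\Gamma_{ji}}$, $[v]_{\Gamma_{ij}} = -[v]_{\Gamma_{ji}}$ and $[v]_{\Gamma_{ij}} n_{ij} = [v]_{\Gamma_{ji}} n_{ji}$.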
3.2 Approximate solution

We derive the discretization of problem (CFP) with the aid of the DGFEM.
The approximate solution $w_h$ as well as the test functions are elements of the
finite dimensional space of vector-valued functions

$$ \mathbf{S}_h = [S_h]^4, \tag{13} $$

where

$$ S_h = S^{p,-1}(\Omega, T_h) = \{v;\ v|_K \in P^p(K)\ \forall K \in T_h\}, \tag{14} $$

$p \in Z^+$ and $P^p(K)$ denotes the space of all polynomials on $K$ of degree $\le p$.

Assuming that $w$ is a classical sufficiently regular solution of problem
(CFP) and $\varphi \in [H^2(\Omega, T_h)]^4$, we multiply equation (1) by $\varphi$, integrate over
$K_i \in T_h$, apply Green's theorem, sum over all $K_i \in T_h$ and with the aid of (8)
we arrive at the identity






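$$ \begin{aligned}
& \int_\Omega \frac{\partial w}{\partial t} \cdot \varphi\,dx
 + \sum_{i \in I} \sum_{j \in S(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} f_s(w)\,(n_{ij})_s \cdot \varphi\,dS
 - \sum_{i \in I} \int_{K_i} \sum_{s=1}^{2} f_s(w) \cdot \frac{\partial\varphi}{\partial x_s}\,dx
 + \sum_{i \in I} \int_{K_i} \sum_{s=1}^{2} R_s(w, \nabla w) \cdot \frac{\partial\varphi}{\partial x_s}\,dx \\
& \quad - \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sum_{s=1}^{2} \langle R_s(w, \nabla w) \rangle\,(n_{ij})_s \cdot [\varphi]\,dS
 - \sum_{i \in I} \sum_{j \in \gamma(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} R_s(w, \nabla w)\,(n_{ij})_s \cdot \varphi\,dS = 0.
\end{aligned} \tag{15} $$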



In order to obtain a stable numerical method we add to (15) some sta- 
bilization terms which vanish for a smooth solution w. In order to define a 
well-posed scheme we have to linearize the viscous terms Rs{w,Vw). From 
(2) we obtain 
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R2 {w, Vw) 

I 1 f dw 3 

I Rewi ydxi 



W 3 dwi 
wi dx\ 



dw 2 W 2 dwi 

dx 2 wi dx 2 



Wi ^ 



Wi 2. 



dw-[ \ 


f dw 2 




dw\ 


5X2 J 


V 5xi 


_ 

W\ 


5xi 


7 


r dw 4 , 


W 4 


dwi 


Re Pr w\ 


5x1 


Wi 


5X2 









iwl + wf) 



where = R^^\w,Vw) denotes the r-th component of Rs (s = 1,2, r = 
2,3). 

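Now for $w = (w_1, \ldots, w_4)^{\mathrm T}$ and $\varphi = (\varphi_1, \ldots, \varphi_4)^{\mathrm T}$ we define the
vector-valued functions

$$ D_1(w, \nabla w, \varphi, \nabla\varphi) = \begin{pmatrix}
0 \\[4pt]
\dfrac{2}{3}\dfrac{1}{Re\,w_1}\left[2\left(\dfrac{\partial \varphi_2}{\partial x_1} - \dfrac{\varphi_2}{w_1}\dfrac{\partial w_1}{\partial x_1}\right) - \left(\dfrac{\partial \varphi_3}{\partial x_2} - \dfrac{\varphi_3}{w_1}\dfrac{\partial w_1}{\partial x_2}\right)\right] \\[8pt]
\dfrac{1}{Re\,w_1}\left[\left(\dfrac{\partial \varphi_3}{\partial x_1} - \dfrac{\varphi_3}{w_1}\dfrac{\partial w_1}{\partial x_1}\right) + \left(\dfrac{\partial \varphi_2}{\partial x_2} - \dfrac{\varphi_2}{w_1}\dfrac{\partial w_1}{\partial x_2}\right)\right] \\[8pt]
\dfrac{w_2}{w_1}D_1^{(2)} + \dfrac{w_3}{w_1}D_1^{(3)} + \dfrac{\gamma}{Re\,Pr\,w_1}\left[\dfrac{\partial \varphi_4}{\partial x_1} - \dfrac{\varphi_4}{w_1}\dfrac{\partial w_1}{\partial x_1}\right] - \dfrac{\gamma}{Re\,Pr\,w_1^2}\left(w_2\dfrac{\partial \varphi_2}{\partial x_1} + w_3\dfrac{\partial \varphi_3}{\partial x_1}\right) + \dfrac{\gamma}{Re\,Pr\,w_1^3}\,(w_2\varphi_2 + w_3\varphi_3)\,\dfrac{\partial w_1}{\partial x_1}
\end{pmatrix}, $$

$$ D_2(w, \nabla w, \varphi, \nabla\varphi) = \begin{pmatrix}
0 \\[4pt]
\dfrac{1}{Re\,w_1}\left[\left(\dfrac{\partial \varphi_3}{\partial x_1} - \dfrac{\varphi_3}{w_1}\dfrac{\partial w_1}{\partial x_1}\right) + \left(\dfrac{\partial \varphi_2}{\partial x_2} - \dfrac{\varphi_2}{w_1}\dfrac{\partial w_1}{\partial x_2}\right)\right] \\[8pt]
\dfrac{2}{3}\dfrac{1}{Re\,w_1}\left[2\left(\dfrac{\partial \varphi_3}{\partial x_2} - \dfrac{\varphi_3}{w_1}\dfrac{\partial w_1}{\partial x_2}\right) - \left(\dfrac{\partial \varphi_2}{\partial x_1} - \dfrac{\varphi_2}{w_1}\dfrac{\partial w_1}{\partial x_1}\right)\right] \\[8pt]
\dfrac{w_2}{w_1}D_2^{(2)} + \dfrac{w_3}{w_1}D_2^{(3)} + \dfrac{\gamma}{Re\,Pr\,w_1}\left[\dfrac{\partial \varphi_4}{\partial x_2} - \dfrac{\varphi_4}{w_1}\dfrac{\partial w_1}{\partial x_2}\right] - \dfrac{\gamma}{Re\,Pr\,w_1^2}\left(w_2\dfrac{\partial \varphi_2}{\partial x_2} + w_3\dfrac{\partial \varphi_3}{\partial x_2}\right) + \dfrac{\gamma}{Re\,Pr\,w_1^3}\,(w_2\varphi_2 + w_3\varphi_3)\,\dfrac{\partial w_1}{\partial x_2}
\end{pmatrix}, \tag{17} $$

where $D_s^{(r)}$ denotes the $r$-th component of $D_s$ ($s = 1, 2$, $r = 2, 3$). Obviously,
$D_1$ and $D_2$ are linear with respect to $\varphi$ and $\nabla\varphi$ and

$$ D_s(w, \nabla w, w, \nabla w) = R_s(w, \nabla w), \quad s = 1, 2. \tag{18} $$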



The definition of $D_s$, $s = 1, 2$, can be given in other forms. We only require
that they are linear with respect to $\varphi$ and $\nabla\varphi$ and satisfy (18). The natural
way to perform the linearization of the diffusion terms follows from (16), where the
space derivatives of $w$ are simply replaced by the derivatives of $\varphi$. This
linearization also gives a dependence on the gradient of the first component of
$\varphi$. However, numerical experiments carried out with the aid of this
linearization do not yield satisfactory results. Therefore we use (17), which gives terms
$D_s$, $s = 1, 2$, independent of $\nabla\varphi_1$. For more detail see [7].
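
The two requirements placed on the linearization, namely linearity in the second pair of arguments and consistency with the original viscous term, can be checked on a single scalar building block. The Python sketch below uses the velocity-gradient term $\frac{1}{w_1}\big(\frac{\partial w_2}{\partial x} - \frac{w_2}{w_1}\frac{\partial w_1}{\partial x}\big)$ and its linearized counterpart in which $w_2$ is replaced by $\varphi_2$ while $w_1$ and its gradient are kept. This is a deliberately reduced, one-dimensional illustration with hypothetical function names, not the full vector-valued forms.

```python
def velocity_gradient_term(w1, w2, dw1, dw2):
    """Nonlinear term: d(w2 / w1)/dx written in conservative variables."""
    return (dw2 - (w2 / w1) * dw1) / w1


def linearized_term(w1, dw1, phi2, dphi2):
    """Linearized term: linear in (phi2, dphi2), independent of any gradient of phi1."""
    return (dphi2 - (phi2 / w1) * dw1) / w1
```

Substituting $\varphi = w$ into `linearized_term` reproduces `velocity_gradient_term`, and the result of a linear combination of test functions is the same linear combination of the individual results.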

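We add the following terms to the left-hand side of (15):

$$ \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sum_{s=1}^{2} \langle D_s(w, \nabla w, \varphi, \nabla\varphi) \rangle\,(n_{ij})_s \cdot [w]\,dS
 + \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} D_s(w, \nabla w, \varphi, \nabla\varphi)\,(n_{ij})_s \cdot w\,dS. \tag{19} $$

In the second term we use the zero natural Neumann boundary conditions
(7), a)-b), and the Dirichlet conditions are taken into account with the aid of
additional terms on the right-hand side of (15).

Moreover, to the left-hand side of (15) we add the vanishing interior penalty
terms

$$ \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sigma\,[w] \cdot [\varphi]\,dS \tag{20} $$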



with cr\rij = boundary penalty terms balanced by additional 

right-hand side terms containing the Dirichlet boundary data. 

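We arrive at the definition of the following form

$$ \begin{aligned}
A_h(w, \varphi) = {} & \sum_{i \in I} \int_{K_i} \sum_{s=1}^{2} R_s(w, \nabla w) \cdot \frac{\partial\varphi}{\partial x_s}\,dx \\
& - \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sum_{s=1}^{2} \langle R_s(w, \nabla w) \rangle\,(n_{ij})_s \cdot [\varphi]\,dS \\
& + \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sum_{s=1}^{2} \langle D_s(w, \nabla w, \varphi, \nabla\varphi) \rangle\,(n_{ij})_s \cdot [w]\,dS \\
& - \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} R_s(w, \nabla w)\,(n_{ij})_s \cdot \varphi\,dS \\
& + \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} \sum_{s=1}^{2} D_s(w, \nabla w, \varphi, \nabla\varphi)\,(n_{ij})_s \cdot (w - w_B)\,dS \\
& + \sum_{i \in I} \sum_{\substack{j \in s(i) \\ j < i}} \int_{\Gamma_{ij}} \sigma\,[w] \cdot [\varphi]\,dS
 + \sum_{i \in I} \sum_{j \in \gamma_D(i)} \int_{\Gamma_{ij}} \sigma\,(w - w_B) \cdot \varphi\,dS.
\end{aligned} $$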
The boundary state $w_B$ will be defined later. The convective terms are
represented by the form

$$ B_h(w_h, \varphi_h) = -\sum_{i \in I} \int_{K_i} \sum_{s=1}^{2} f_s(w_h) \cdot \frac{\partial\varphi_h}{\partial x_s}\,dx
 + \sum_{i \in I} \sum_{j \in S(i)} \int_{\Gamma_{ij}} H(w_h|_{\Gamma_{ij}}, w_h|_{\Gamma_{ji}}, n_{ij}) \cdot \varphi_h\,dS, \quad w_h, \varphi_h \in H^1(\Omega, T_h)^4, \tag{21} $$

where $H$ is a suitable numerical flux commonly used in the finite volume
method. We use the numerical flux based on the direct solution of the local
Riemann problem, see [10]. If $\Gamma_{ij} \subset \partial\Omega_h$, then there is no neighbour $K_j$ of $K_i$
adjacent to $\Gamma_{ij}$ and the values of $w_h|_{\Gamma_{ji}}$ must be determined on the basis of
"inviscid" boundary conditions, see [9].

The boundary state $w_B = (w_{B1}, \ldots, w_{B4})^{\mathrm T}$ is determined in the following
way: We set

$$ w_{Br}|_{\Gamma_{ij}} = w_r^*|_{\Gamma_{ij}}, \tag{22} $$

if the $r$-th component $w_r$ of $w$ is prescribed on $\Gamma_{ij}$. Here $w_r^*$ is the $r$-th
component of $w^* : Q_T \to \mathbb{R}^4$, which is a function satisfying the boundary conditions
(5) - (7). Otherwise, we set

$$ w_{Br}|_{\Gamma_{ij}} = w_r|_{\Gamma_{ij}}, \tag{23} $$

which means that we use the "extrapolation" of $w_r$ onto $\Gamma_{ij}$ from $K_i \in T_h$. In
particular, we have

$$ w_B = (\rho_{ij}, 0, 0, \rho_{ij}\theta_{ij}) \quad \text{on } \Gamma_W, \tag{24} $$

$$ w_B = \big(\rho_D, \ \rho_D v_{D1}, \ \rho_D v_{D2}, \ \rho_{ij}\theta_{ij} + \tfrac{1}{2}\rho_D |v_D|^2\big) \quad \text{on } \Gamma_I, $$

where $\rho_D$ and $v_D = (v_{D1}, v_{D2})$ are the given density and velocity from the
boundary conditions (5) - (7) and $\rho_{ij}$, $\theta_{ij}$ are the values of the density and
absolute temperature extrapolated from $K_i$ onto $\Gamma_{ij}$.

Now the discrete DGFE Navier-Stokes problem reads:

Definition 1. An approximate DGFE solution of the compressible Navier-Stokes
problem (CFP) is defined as a vector-valued function $w_h$ such that

$$ \text{a) } w_h \in C^1([0,T]; \mathbf{S}_h), \tag{25} $$

$$ \text{b) } \frac{d}{dt}\big(w_h(t), \varphi_h\big) + B_h(w_h(t), \varphi_h) + A_h(w_h(t), \varphi_h) = 0 \quad \forall \varphi_h \in \mathbf{S}_h, \ t \in (0,T), $$

$$ \text{c) } w_h(0) = w_h^0, $$

where $w_h^0$ is an $\mathbf{S}_h$-approximation of $w^0$.

The problem (25) represents a system of ordinary differential equations which
can be solved with the aid of a suitable ODE solver.
4 Numerical examples

We implemented the numerical scheme (25), a)-c) with the aid of a piecewise
linear approximation on regular triangular grids. The system of ODEs was
solved by the explicit Euler method.
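
The time stepping mentioned here follows the usual forward (explicit) Euler pattern. The Python sketch below shows it for a generic semi-discrete system $dw/dt = F(w)$. It assumes that the mass matrix has already been inverted inside the hypothetical callback `rhs`, so that `rhs(w)` returns the time derivative of the coefficient vector; the function name and interface are illustrative only.

```python
def explicit_euler(rhs, w0, dt, steps):
    """Advance dw/dt = rhs(w) from w0 by `steps` forward Euler steps of size dt.

    ASSUMPTION: rhs(w) returns a sequence with the same length as w,
    with any mass matrix already inverted inside it.
    """
    w = list(w0)
    for _ in range(steps):
        r = rhs(w)
        w = [wi + dt * ri for wi, ri in zip(w, r)]
    return w
```

For the scalar test equation $dw/dt = -w$ each step multiplies the solution by $1 - \Delta t$, so ten steps with $\Delta t = 0.1$ give $0.9^{10}$. Being explicit, the method is only conditionally stable, so the time step has to be chosen small enough for the mesh and the flow regime.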

Now we consider four cases of viscous flow around the NACA0012 profile
with the following data, see [3]:

case   M_in   alpha   Re
C1     0.80   10°     500
C2     2.00   10°     106
C3     0.85   0°      500
C4     0.85   0°      2000

where $M_{in}$ is the far field Mach number, $\alpha$ the angle of attack and $Re$ the
Reynolds number. We compare our results with the numerical results
presented in [3], where ten methods were applied. The following table contains
our computed lift $c_L$ and drag $c_D$ coefficients in comparison with [3] ($\#T_h$
denotes the number of elements of the mesh $T_h$):



               computed values      reference values from [3]
case   #T_h    c_L      c_D         c_L (range); mean value        c_D (range); mean value
C1     4563    0.4985   0.1938      (0.4199 - 0.5170); 0.4526      (0.1597 - 0.2868); 0.2559
C2     5640    0.3969   0.4172      (0.3063 - 0.4059); 0.3443      (0.4120 - 0.4910); 0.4660
C3     4946    0.0003   0.2304      (0.0000 - 0.0007); 0.0001      (0.1790 - 0.2420); 0.2192
C4     4946    0.0001   0.1179      (0.0000 - 0.0002); 0.0001      (0.1012 - 0.1360); 0.1171



The computed values of drag and lift correspond to the reference values
from [3]. Figure 1 shows the employed triangulation and the computed isolines
of the Mach number for the case C2.
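
The agreement with the reference data can be made explicit with a few lines of Python that test whether each computed coefficient lies inside the range reported for the reference methods. The numbers are copied from the table above; the dictionary layout and the function name are hypothetical.

```python
# (computed value, reference range low, reference range high), copied from the table
RESULTS = {
    "C1": {"c_L": (0.4985, 0.4199, 0.5170), "c_D": (0.1938, 0.1597, 0.2868)},
    "C2": {"c_L": (0.3969, 0.3063, 0.4059), "c_D": (0.4172, 0.4120, 0.4910)},
    "C3": {"c_L": (0.0003, 0.0000, 0.0007), "c_D": (0.2304, 0.1790, 0.2420)},
    "C4": {"c_L": (0.0001, 0.0000, 0.0002), "c_D": (0.1179, 0.1012, 0.1360)},
}


def within_reference(results):
    """Map each case and coefficient to True if the computed value lies in the reference range."""
    return {
        case: {name: low <= value <= high for name, (value, low, high) in coeffs.items()}
        for case, coeffs in results.items()
    }
```

All eight computed coefficients fall inside the corresponding reference ranges.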
Fig. 1. Viscous flow along NACA 0012, case C2, triangulation (top), isolines of the
Mach number (bottom)
Summary. We introduce a new finite volume scheme for the discretization of the
incompressible Navier-Stokes equations on general meshes, for which we prove
convergence without any condition on the regularity of the solution. Numerical results
are presented.



1 The incompressible Navier-Stokes equations 

Numerical schemes for the Navier-Stokes equations (1) have been extensively 
studied: see [7, 11, 12, 13, 8, 6, 15] and references therein. An advantage of 
the finite volume schemes is that the unknowns are approximated by piece- 
wise constant functions: this makes it easy to take into account additional 
nonlinear phenomena or the coupling with algebraic or differential equations, 
for instance in the case of reactive flows. The classical finite volume scheme
on rectangular meshes, which is the basis of many industrial applications, is
presented in [11]. A convergence proof of the so-called MAC scheme is given in
[10] in the case of a uniform rectangular grid. However, the use of rectangular
grids limits the type of domain which can be gridded, and more recently, finite
volume schemes for the Navier-Stokes equations on triangular grids have been
presented: see for example [9] where the vorticity formulation is used, and [2]
where primal variables are used with a Chorin type projection method (but
no proof of convergence is known). Here, we propose a new method using the
primitive variables and enforcing the divergence condition directly, using quite
general meshes such as mixed rectangular-triangular or Voronoï meshes, and
for which we are able to prove convergence under general conditions (in
particular, no regularity of the exact solution is required). An error estimate in
the case of the linear Stokes equations was presented in [4].

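We seek an approximation of $u = (u^{(1)}, u^{(2)})^t \in H_0^1(\Omega) \times H_0^1(\Omega)$ and
$p \in L^2(\Omega)$, weak solution to the incompressible generalized Navier-Stokes
equations:

$$\eta u^{(i)} - \nu \Delta u^{(i)} + \partial_i p + \sum_{j=1,2} u^{(j)} \partial_j u^{(i)} = f^{(i)} \text{ in } \Omega, \text{ for } i = 1, 2, \qquad \partial_1 u^{(1)} + \partial_2 u^{(2)} = 0 \text{ in } \Omega, \tag{1}$$

where $\eta \ge 0$, $u^{(1)}$ and $u^{(2)}$ are the two components of the velocity, $p$ denotes
the pressure, $\nu$ the viscosity of the fluid, under the following assumptions:

$\Omega$ is a polygonal open bounded connected subset of $\mathbb{R}^2$, (2)

$\nu \in (0, +\infty)$, $\eta \in [0, +\infty)$, (3)

$f^{(i)} \in L^2(\Omega)$, for $i = 1, 2$. (4)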

The terms $\eta u^{(i)}$ appear when considering an implicit time discretization of
the unsteady Stokes or Navier-Stokes equations (with $\eta$ as the inverse of the
time step); $\eta = 0$ yields the steady-state case.

We prescribe for both problems a homogeneous Dirichlet boundary condition
on the velocity $u$. Let us denote by $x = (x^{(1)}, x^{(2)})$ any point of
$\Omega$ and by $dx$ the 2-dimensional Lebesgue measure $dx = dx^{(1)} dx^{(2)}$.

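Definition 1 (Weak solution). Under hypotheses (2)-(4), $u = (u^{(1)}, u^{(2)})^t$
is called a weak solution of (1) if and only if

$$u \in E(\Omega), \qquad \eta \int_\Omega u(x) \cdot v(x)\,dx + \nu \int_\Omega \nabla u(x) : \nabla v(x)\,dx + b(u,u,v) = \int_\Omega f(x) \cdot v(x)\,dx, \quad \forall v = (v^{(1)}, v^{(2)})^t \in E(\Omega), \tag{5}$$

where the trilinear form $b$ is defined for all $u, v, w \in (H_0^1(\Omega))^2$ by

$$b(u,v,w) = \sum_{k=1,2} \sum_{i=1,2} \int_\Omega u^{(i)}(x)\,\partial_i v^{(k)}(x)\,w^{(k)}(x)\,dx, \tag{6}$$

which classically satisfies, for all $u \in E(\Omega)$,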

6(u,n,.)= EE L {x)dx. 

k=l,2 i=l,2 

2 The finite volume scheme 

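Definition 2. [Admissible discretization] Let $\Omega$ be an open bounded polygonal
subset of $\mathbb{R}^2$, and $\partial\Omega = \overline{\Omega} \setminus \Omega$ its boundary. An admissible finite volume
discretization of $\Omega$, denoted by $\mathcal{D}$, is given by $\mathcal{D} = (\mathcal{M}, \mathcal{E}, \mathcal{P}, \mathcal{V})$, where:

- $\mathcal{M}$ is a finite family of non empty open polygonal convex disjoint subsets of
$\Omega$ (the "control volumes") such that $\overline{\Omega} = \cup_{K \in \mathcal{M}} \overline{K}$. For any $K \in \mathcal{M}$, let
$\partial K = \overline{K} \setminus K$ be the boundary of $K$ and $m(K) > 0$ denote the area of $K$.

- $\mathcal{E}$ is a finite family of disjoint subsets of $\overline{\Omega}$ (the "edges" of the mesh), such
that, for all $\sigma \in \mathcal{E}$, there exists a hyperplane $E$ of $\mathbb{R}^2$ and $K \in \mathcal{M}$ with
$\overline{\sigma} = \partial K \cap E$ and $\sigma$ is a non empty open subset of $E$. We then denote by
$m_\sigma > 0$ the 1-dimensional measure of $\sigma$. We assume that, for all $K \in \mathcal{M}$,
there exists a subset $\mathcal{E}_K$ of $\mathcal{E}$ such that $\partial K = \cup_{\sigma \in \mathcal{E}_K} \overline{\sigma}$. It then results from
the previous hypotheses that, for all $\sigma \in \mathcal{E}$, either $\sigma \subset \partial\Omega$, or there exists
$(K, L) \in \mathcal{M}^2$ with $K \ne L$ such that $\overline{K} \cap \overline{L} = \overline{\sigma}$; we denote in the latter case
$\sigma = K|L$.

- $\mathcal{P}$ is a family of points of $\Omega$ indexed by $\mathcal{M}$, denoted by $\mathcal{P} = (x_K)_{K \in \mathcal{M}}$. The
coordinates of $x_K$ are denoted by $x_K^{(i)}$, $i = 1, 2$. The family $\mathcal{P}$ is such that,
for all $K \in \mathcal{M}$, $x_K \in K$. Furthermore, for all $\sigma \in \mathcal{E}$ such that there exists
$(K, L) \in \mathcal{M}^2$ with $\sigma = K|L$, it is assumed that the straight line $(x_K, x_L)$
going through $x_K$ and $x_L$ is orthogonal to $K|L$. For all $K \in \mathcal{M}$ and all
$\sigma \in \mathcal{E}_K$, let $z_\sigma$ be the orthogonal projection of $x_K$ on $\sigma$. We suppose that
$z_\sigma \in \sigma$.

- $\mathcal{V}$ is a finite family of non empty open polygonal disjoint subsets of $\Omega$
(constituting the "dual mesh" of $\mathcal{M}$), which are centered around the vertices
$(x_s)_{s=1,N_\mathcal{V}}$ in the following way ($N_\mathcal{V}$ is the number of vertices):
for $1 \le s \le N_\mathcal{V}$, let $\mathcal{M}_s \subset \mathcal{M}$ be the set of control volumes of which $x_s$ is a
vertex. For $K \in \mathcal{M}_s$, denote by $\sigma_{K,s,1}$ and $\sigma_{K,s,2} \in \mathcal{E}_K$ the two edges of $K$
with vertex $x_s$. Define $K_s$ as the convex hull of the four points
$(x_s, x_K, z_{\sigma_{K,s,1}}, z_{\sigma_{K,s,2}})$.
The dual cell around $x_s$, denoted by $S$, is then defined as (also see Figure 1)

$$\overline{S} = \cup_{K \in \mathcal{M}_s} \overline{K_s}.$$

Since there is a one-to-one mapping between the set $\{1, \ldots, N_\mathcal{V}\} \subset \mathbb{N}$ and
the set $\mathcal{V}$, we shall replace all subscripts $s$ by $S$ when dealing with the dual
mesh. Let $\mathcal{V}_K$ denote the set of vertices of a given control volume $K$. Note
that:

$$\overline{K} = \cup_{S \in \mathcal{V}_K} \overline{K_S} \quad \text{and} \quad K_S = K \cap S.$$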

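The size of the discretization is defined by:

$$\text{size}(\mathcal{D}) = \sup\{\text{diam}(K), K \in \mathcal{M}\}.$$

The regularity of the mesh is defined by

$$\text{angle}(\mathcal{D}) = \inf\{ |\widehat{z_\sigma x_K x_S}|, |\widehat{z_\sigma x_S x_K}|, K \in \mathcal{M}, S \in \mathcal{V}_K, \sigma \in \mathcal{E}_K \cap \mathcal{E}_S \}, \tag{7}$$

where $|\widehat{\cdot}|$ designates the absolute value of the measure of the angle (note
that $\widehat{z_\sigma x_K x_S} = \frac{\pi}{2} - \widehat{z_\sigma x_S x_K}$).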

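For all $K \in \mathcal{M}$ and $\sigma \in \mathcal{E}_K$, we denote by $n_{K,\sigma}$ the unit vector normal to
$\sigma$ outward to $K$. We denote by $d_{K,\sigma}$ the Euclidean distance between $x_K$ and
$\sigma$. We then define

$$\tau_{K,\sigma} = \frac{m_\sigma}{d_{K,\sigma}}.$$

The set of interior (resp. boundary) edges is denoted by $\mathcal{E}_{int}$ (resp. $\mathcal{E}_{ext}$),
that is $\mathcal{E}_{int} = \{\sigma \in \mathcal{E}; \sigma \not\subset \partial\Omega\}$ (resp. $\mathcal{E}_{ext} = \{\sigma \in \mathcal{E}; \sigma \subset \partial\Omega\}$). For any
$\sigma \in \mathcal{E}_{int}$, $\sigma = K|L$ (resp. $\sigma \in \mathcal{E}_{ext} \cap \mathcal{E}_K$), let $x_\sigma$ be the center point of the line
segment $[x_K, x_L]$ (resp. $[x_K, z_\sigma]$), and $x_\sigma^{(1)}$ and $x_\sigma^{(2)}$ its coordinates.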

For all $K \in \mathcal{M}$ and all $S \in \mathcal{V}_K$, let $\sigma_1$ and $\sigma_2 \in \mathcal{E}_K \cap \mathcal{E}_S$ numbered such
that {x^ai — X^g^){x^a^ — x^g^) — {x^a} ~ X^^){x^a} ~ ^5 > 0. 

Let — x^a} — x^a 2 and — x^^ . 

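Definition 3. Let $\Omega$ be an open bounded polygonal subset of $\mathbb{R}^N$, with $N \in \mathbb{N}^*$.
Let $\mathcal{D} = (\mathcal{M}, \mathcal{E}, \mathcal{P}, \mathcal{V})$ be an admissible finite volume discretization of $\Omega$ in the
sense of Definition 2. We denote by $H_\mathcal{D}(\Omega) \subset L^2(\Omega)$ the space of functions
which are piecewise constant on each control volume $K \in \mathcal{M}$. For all $w \in H_\mathcal{D}(\Omega)$
and for all $K \in \mathcal{M}$, we denote by $w_K$ the constant value of $w$ in $K$
and we define $(w_\sigma)_{\sigma \in \mathcal{E}}$ by:

$$w_\sigma = 0, \quad \forall \sigma \in \mathcal{E}_{ext} \tag{8}$$

and

$$\tau_{K,\sigma}(w_\sigma - w_K) + \tau_{L,\sigma}(w_\sigma - w_L) = 0, \quad \forall \sigma \in \mathcal{E}_{int}, \ \sigma = K|L. \tag{9}$$

Let $L_\mathcal{D}(\Omega)$ be the space of functions which are piecewise constant on the
domains $S$, for all $S \in \mathcal{V}$. Let $\text{div}_\mathcal{D} : (H_\mathcal{D}(\Omega))^2 \to L_\mathcal{D}(\Omega)$ be defined by: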

divi,(u)(x) = — ^ ^ for a.e. X e S,\/S gV. 

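We then set $E_\mathcal{D}(\Omega) = \{u \in (H_\mathcal{D}(\Omega))^2, \text{div}_\mathcal{D}(u) = 0\}$. For $(v, w) \in (H_\mathcal{D}(\Omega))^2$,
we denote by

$$[v, w]_\mathcal{D} = \sum_{K \in \mathcal{M}} \sum_{\sigma \in \mathcal{E}_K} \tau_{K,\sigma} (v_\sigma - v_K)(w_\sigma - w_K). \tag{10}$$

Remark that thanks to (9), one has:

$$[v, w]_\mathcal{D} = \sum_{\sigma \in \mathcal{E}_{int},\, \sigma = K|L} \tau_\sigma (v_K - v_L)(w_K - w_L) + \sum_{\sigma \in \mathcal{E}_{ext}} \tau_\sigma v_{K_\sigma} w_{K_\sigma},$$

where $K_\sigma$ denotes the control volume to which $\sigma$ is an edge. We define a norm
in $H_\mathcal{D}(\Omega)$ (thanks to the discrete Poincaré inequality (11) given below) by

$$|w|_\mathcal{D} = ([w, w]_\mathcal{D})^{1/2}.$$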
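
The passage from (10) to the edge-based expression can be verified on a single interior edge. The sketch below solves (9) for the edge value and compares the two-sided contribution to (10) with a one-sided expression. The closed form $\tau_\sigma = \tau_{K,\sigma}\tau_{L,\sigma}/(\tau_{K,\sigma}+\tau_{L,\sigma})$ used here is what one obtains by substituting the solution of (9) into (10); it is stated as a derived assumption since the text above does not spell it out. All function names and sample values are illustrative.

```python
def edge_value(tau_k, w_k, tau_l, w_l):
    """Interior edge value w_sigma obtained by solving relation (9)."""
    return (tau_k * w_k + tau_l * w_l) / (tau_k + tau_l)


def two_sided_term(tau_k, tau_l, v_k, v_l, w_k, w_l):
    """Contribution of the edge K|L to (10), summed over both neighbours."""
    v_s = edge_value(tau_k, v_k, tau_l, v_l)
    w_s = edge_value(tau_k, w_k, tau_l, w_l)
    return (tau_k * (v_s - v_k) * (w_s - w_k)
            + tau_l * (v_s - v_l) * (w_s - w_l))


def one_sided_term(tau_k, tau_l, v_k, v_l, w_k, w_l):
    """Same contribution written with a single edge transmissibility."""
    tau_sigma = tau_k * tau_l / (tau_k + tau_l)  # derived from (9), see lead-in
    return tau_sigma * (v_k - v_l) * (w_k - w_l)


# arbitrary sample data for one interior edge
sample = (2.0, 0.5, 1.0, -3.0, 0.25, 4.0)
lhs = two_sided_term(*sample)
rhs = one_sided_term(*sample)
# residual of (9) for the w values of the sample
w_sigma = edge_value(2.0, 0.25, 0.5, 4.0)
residual = 2.0 * (w_sigma - 0.25) + 0.5 * (w_sigma - 4.0)
```

On a boundary edge, (8) gives $w_\sigma = 0$, so the contribution reduces to $\tau_{K,\sigma} v_K w_K$.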

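Similarly, for $u = (u^{(1)}, u^{(2)})^t \in (H_\mathcal{D}(\Omega))^2$, $v = (v^{(1)}, v^{(2)})^t \in (H_\mathcal{D}(\Omega))^2$ and
$w = (w^{(1)}, w^{(2)})^t \in (H_\mathcal{D}(\Omega))^2$, define:

$$|u|_\mathcal{D} = \Big( \sum_{i=1,2} [u^{(i)}, u^{(i)}]_\mathcal{D} \Big)^{1/2}$$

and

$$[v, w]_\mathcal{D} = \sum_{i=1,2} [v^{(i)}, w^{(i)}]_\mathcal{D}.$$

The discrete Poincaré inequality (see [3]) writes:

$$\|w\|_{L^2(\Omega)} \le \text{diam}(\Omega)\,|w|_\mathcal{D}, \quad \forall w \in H_\mathcal{D}(\Omega). \tag{11}$$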
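
To make inequality (11) concrete, the following sketch evaluates both sides on a uniform n x n Cartesian grid of the unit square with the cell centres as the points $x_K$. This particular mesh, the random data and the function name `discrete_norms` are assumptions made for illustration; the discrete norm is computed directly from (8), (9) and (10).

```python
import math
import random


def discrete_norms(w, n):
    """Return (L2 norm, |w|_D) for a piecewise constant function on the
    uniform n x n grid of the unit square; w[i][j] is the value in cell (i, j)."""
    h = 1.0 / n
    tau = 2.0  # tau_{K,sigma} = m_sigma / d_{K,sigma} = h / (h / 2)
    l2_sq = sum(h * h * w[i][j] ** 2 for i in range(n) for j in range(n))
    semi_sq = 0.0
    for i in range(n):
        for j in range(n):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    # interior edge: (9) with equal transmissibilities
                    w_sigma = 0.5 * (w[i][j] + w[ii][jj])
                else:
                    # boundary edge: (8)
                    w_sigma = 0.0
                semi_sq += tau * (w_sigma - w[i][j]) ** 2
    return math.sqrt(l2_sq), math.sqrt(semi_sq)


random.seed(0)
n = 8
w = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
l2_norm, h1_norm = discrete_norms(w, n)
diam = math.sqrt(2.0)  # diameter of the unit square
```

The inequality holds with a large margin for oscillating data such as this; it is closest to equality for smooth, slowly varying functions.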

We only present here a centered finite volume scheme, and refer to [5] for the
upstream version. Under hypotheses (2)-(4), let $\mathcal{D}$ be an admissible
discretization of $\Omega$. Let $\lambda \in (0, +\infty)$. The finite volume scheme for the approximation
of the solution of (1) writes: find $u$ such that

$$u \in E_\mathcal{D}(\Omega), \qquad \eta \int_\Omega u(x) \cdot v(x)\,dx + \nu [u, v]_\mathcal{D} + b_\mathcal{D}(u, u, v) = \int_\Omega f(x) \cdot v(x)\,dx, \quad \forall v \in E_\mathcal{D}(\Omega), \tag{12}$$



where, for $u$, $v$ and $w \in (H_\mathcal{D}(\Omega))^2$,
bv{u,V,w) = Y, ^k]s '^K

KeM k=l,2 SgVk i=l,2 

4'^ = ^ E M^nS)v^j^\VSGV, k = l,2. 

The trilinear form $b_\mathcal{D}(u, v, w)$ satisfies some continuity properties in $(H_\mathcal{D}(\Omega))^2$
(see [5] for the proof).

Lemma 1. [Continuity of the trilinear form in discrete space] Under
Hypothesis (2), let $\mathcal{D}$ be an admissible discretization in the sense of
Definition 2, let $\alpha > 0$ be such that $\text{angle}(\mathcal{D}) > \alpha$, let $H_\mathcal{D}(\Omega)$ be the space of piecewise
constant functions defined in Definition 3 and let $b_\mathcal{D}$ be the trilinear form defined by (13).
Then there exists $C_1 > 0$, only depending on $\alpha$, such that:

$$|b_\mathcal{D}(u, v, w)| \le C_1 |u|_\mathcal{D} |v|_\mathcal{D} |w|_\mathcal{D}. \tag{14}$$

As in the case of the linear problem (see [4]), we use the following penalized 
approximation of (12): 



(13) 



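$$(u, p) \in (H_\mathcal{D}(\Omega))^2 \times L_\mathcal{D}(\Omega),$$
$$\nu [u, v]_\mathcal{D} - \int_\Omega p(x)\,\text{div}_\mathcal{D}(v)(x)\,dx + b_\mathcal{D}(u, u, v) = \int_\Omega f(x) \cdot v(x)\,dx, \quad \forall v \in (H_\mathcal{D}(\Omega))^2, \tag{15}$$
$$\text{div}_\mathcal{D}(u) = -\lambda\,\text{size}(\mathcal{D})\,p.$$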



3 Convergence of the scheme 

The following proposition gives a sufficient condition for the existence and 
uniqueness of a solution to the scheme (with or without penalization), un- 
der the classical assumption that the data are small, or the viscosity is large 
enough (see [14] Theorem 1.3 page 167 for the continuous case). Note that 
in the continuous case, the “small data” assumption is only required to prove 
uniqueness, not existence. Here, however, this assumption is also required for 
the existence of a discrete solution. Moreover, uniqueness is only proven for
"small enough" solutions.

Proposition 1 [Existence and uniqueness of small discrete solutions
in the small data case, with or without a penalization] Under hypotheses
(2)-(4), let $\mathcal{D}$ be an admissible discretization of $\Omega$ in the sense of Definition
2 and let $\alpha > 0$ with $\text{angle}(\mathcal{D}) > \alpha$. Let $C_1$ be the real value which only depends
on $\alpha$, given by (14) of Lemma 1. Assume that the condition



(16)

is fulfilled. Then there exists one and only one function $u \in (H_\mathcal{D}(\Omega))^2$ such
that



1/21 



\u\v < Cs 



2Ci 



diam(i7)Ci j 



(17) 

and $u$ is solution to (12) and (13) (no penalization), or $u$ is such that there
exists a function $p$ with $(u, p)$ solution to (15) and (13) for a given $\lambda \in (0, +\infty)$.
Furthermore, in the latter case, the following inequality holds:



A size(P) |b||| 2 (^ 2 ) < diam(i?) 



V ll/^*^l|L2(n) j +CiC3^. 

i=l,2 / 



(18) 



and the function p is unique too. 

Proposition 2 [Convergence of the centered penalized scheme in the
nonlinear case] Under Hypotheses (2)-(4), let $\alpha > 0$ be given and let $C_2 > 0$
be given by Proposition 1. We assume that the property (16) holds. Let
$\lambda \in (0, +\infty)$ be given and let $(\mathcal{D}^{(n)})_{n \in \mathbb{N}}$ be a sequence of admissible discretizations of
$\Omega$ in the sense of Definition 2, such that $\lim_{n \to \infty} \text{size}(\mathcal{D}^{(n)}) = 0$ and
$\text{angle}(\mathcal{D}^{(n)}) > \alpha$, for all $n \in \mathbb{N}$. Let $(u^{(n)}, p^{(n)}) \in (H_{\mathcal{D}^{(n)}}(\Omega))^2 \times L_{\mathcal{D}^{(n)}}(\Omega)$ be a solution to
(15), (13), (17). Then there exists a subsequence of the sequence $(u^{(n)})_{n \in \mathbb{N}}$
which converges in $L^2(\Omega)^2$ to $u$, weak solution of the Navier-Stokes problem
in the sense of (5). If $C_2$ is taken small enough, the uniqueness property of
the solution entails the convergence of the whole sequence.



4 Numerical results 

Experiments with an analytical solution were performed. For the centered 
scheme, the results indicate a rate of convergence of h? for the velocities, and 
better than h^-^ for the pressures in the case of unstructured triangular meshes. 
In the case of rectangular meshes or structured triangular meshes, we obtain 
an order h‘^ for the velocities and better than h for the pressures. For the 
upstream weighting scheme on structured meshes, we obtain an order h^'^ for 
the velocities. 
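
Rates of convergence such as those quoted above are usually estimated from the errors measured on two meshes of different size. The sketch below shows the standard formula; the mesh sizes and errors are synthetic (an exactly second order error model chosen for illustration) and are not data from the experiments reported here.

```python
import math


def observed_order(h_coarse, err_coarse, h_fine, err_fine):
    """Estimate p in err ~ C h^p from the errors on two mesh sizes."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)


# hypothetical mesh sizes and errors following err = 3 h^2 exactly
mesh_sizes = [0.1, 0.05, 0.025]
errors = [3.0 * h ** 2 for h in mesh_sizes]
orders = [
    observed_order(mesh_sizes[k], errors[k], mesh_sizes[k + 1], errors[k + 1])
    for k in range(len(mesh_sizes) - 1)
]
```

On unstructured meshes the mesh size h is typically taken as size(D), and the estimate is only meaningful once the meshes are fine enough to be in the asymptotic regime.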

Some experiments were also carried out for the classical example of the lid 
driven cavity, using triangular meshes. We refer to [5] for these, and shall only 
give here some results on the backward facing step, for a Reynolds number 
equal to 800. This is a well documented case in the literature (see e.g. [1]), and
allows one to test the performance of methods with respect to the precision on the
zones of recirculating flow. The geometrical data of the backward step is taken
from [1]. We computed the streamlines using a reconstruction of a discrete
potential located at the edges $\sigma \in \mathcal{E}$ of the mesh (see [5]). We present
in Figure 2 the streamlines in three different cases: starting from the top, the
first figure is obtained with the centered scheme, using a 25200 rectangular grid
blocks mesh, the second one with the centered scheme using a 2800 rectangular 
grid blocks mesh, the third one with the upstream scheme using a 2800 rect- 
angular grid blocks mesh, and the two last ones with respectively the centered 
and the upstream scheme for 847 cells. It is clear from these figures that the 
centered scheme is, as one could expect, more precise, but that it becomes un- 
stable for coarser meshes. In fact, for a mesh of 700 cells, the Newton iterations 
do not converge, even when using an under-relaxation procedure. 
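
For readers unfamiliar with under-relaxation, the idea is to apply only a fraction of each Newton update. The sketch below shows this on a scalar equation; it is not the solver used for the scheme above, and the function name `damped_newton`, the relaxation factor `omega` and the test equation are illustrative assumptions.

```python
def damped_newton(f, df, x0, omega=1.0, tol=1e-10, max_iter=200):
    """Newton iteration with under-relaxation factor omega in (0, 1].

    Returns the approximate root and the number of iterations used.
    """
    x = x0
    for iteration in range(1, max_iter + 1):
        step = f(x) / df(x)
        x -= omega * step  # omega = 1 gives the plain Newton update
        if abs(f(x)) < tol:
            return x, iteration
    raise RuntimeError("Newton iteration did not converge")


# hypothetical scalar test equation: x^3 + x - 1 = 0
def f(x):
    return x ** 3 + x - 1.0


def df(x):
    return 3.0 * x ** 2 + 1.0


root_full, n_full = damped_newton(f, df, 1.0)
root_relaxed, n_relaxed = damped_newton(f, df, 1.0, omega=0.5)
```

Under-relaxation trades the quadratic convergence of Newton's method for shorter, more cautious steps, which is why it costs more iterations on a well-behaved problem like this one.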




Fig. 2. Streamlines for the backward step. Panels, from top to bottom: centered
scheme, 25200 cells; centered scheme, 2800 cells; upstream scheme, 25200 cells;
centered scheme, 847 cells; upstream scheme, 847 cells.



The numerical solution obtained with the centered scheme, using a 25200 
rectangular grid blocks mesh seems to be precise enough (comparing the sep- 
aration and reattachment lengths with those of the literature, see [5]) to be 
used as a reference solution for experiments carried out on coarser meshes. 
This allows to compute a rate of convergence of 

We conclude from these numerical tests that the upstream scheme is too 
diffusive and cannot be used for accurate results, although it has the advantage 
of remaining stable even on coarse meshes. The centered scheme yields accurate 
results for a reasonable number of Newton iterations (typically between 5 
and 15).
Future developments will concentrate on the extension to three-dimensional
meshes and to the time-dependent case.



References 

1. B.F. Armaly, F. Durst, J.C.F. Pereira and B. Schonung, Experimental and
theoretical investigation of backward-facing step flow, J. Fluid Mech., vol. 127,
pp. 473-496, 1983.

2. S. Boivin, F. Cayre, J.M. Herard, A finite volume method to solve the
Navier-Stokes equations for incompressible flows on unstructured meshes, Int. J. Therm.
Sci., 38, 806-825, 2000.

3. R. Eymard, T. Gallouet and R. Herbin, Finite Volume Methods, Handbook of
Numerical Analysis, Vol. VII, pp. 713-1020. Edited by P.G. Ciarlet and J.L. Lions
(North Holland).

4. R. Eymard and R. Herbin, A cell-centered finite volume scheme on general meshes
for the Stokes equations in two dimensions, CRAS, Mathématiques, t. 337, 2,
125-128, 2003.

5. R. Eymard and R. Herbin, A finite volume scheme on general meshes for the
steady Navier-Stokes problem in two space dimensions, LATP Report n° , submitted.

6. J.H. Ferziger, M. Peric, Computational Methods for Fluid Dynamics. Springer,
Berlin, 1996.

7. V. Girault, P.-A. Raviart, Finite element methods for the Navier-Stokes
equations: Theory and algorithms. Springer, Berlin, 1986.

8. M.D. Gunzburger, Finite element methods for viscous incompressible flows, A
guide to theory, practice, and algorithms. Computer Science and Scientific
Computing, Academic Press, 1989.

9. M.D. Gunzburger and R.A. Nicolaides, Incompressible computational fluid
dynamics, Cambridge University Press, 1993.

10. R.A. Nicolaides and X. Wu, Analysis and convergence of the MAC scheme II,
Navier-Stokes equations. Math. Comp. 65 (1996), 29-44.

11. S.V. Patankar, Numerical Heat Transfer and Fluid Flow, Series in
Computational Methods in Mechanics and Thermal Sciences, Minkowycz and Sparrow
Eds. (McGraw Hill), 1980.

12. R. Peyret and T. Taylor, Computational methods for fluid flow. Springer,
New York, 1983.

13. O. Pironneau, Finite element methods for fluids, John Wiley and Sons, 1989.

14. R. Temam, Navier-Stokes Equations, Studies in mathematics and its
applications, J.L. Lions, G. Papanicolaou, R.T. Rockafellar Editors, North-Holland,
1977.

15. P. Wesseling, Principles of Computational Fluid Dynamics, Springer, Berlin,
2001.




Existence and Uniqueness of a Weak Solution 
to a Stratigraphic Model 



Robert Eymard^1, Thierry Gallouët^2, Véronique Gervais^3 and Roland
Masson^3

^1 Dépt de Mathématiques, Université de Marne-La-Vallée, Marne-La-Vallée,
France; eymard@math.univ-mlv.fr
^2 LATP, Université de Provence, Marseille, France;
Thierry.Gallouet@cmi.univ-mrs.fr
^3 Institut Français du Pétrole, Rueil-Malmaison, France; Veronique.Gervais@ifp.fr,
Roland.Masson@ifp.fr



Summary. In this paper, we study a multi-lithology diffusion model used to
simulate the evolution through time of a sedimentary basin composed of several lithologies
such as sand or shale. It is a simplified model for which the surficial flux in lithology
i is taken proportional to the slope and to a lithology fraction $c_i^s$ in lithology i at
the top of the basin with a unitary diffusion coefficient. Thus, the sediment thickness
variable satisfies a linear parabolic problem and decouples from the other unknowns.
The remaining equations couple, for each lithology, a first order linear equation for
the surface concentration $c_i^s$ with a linear advection equation for the basin
concentration, for which $c_i^s$ appears as an input boundary condition at the top of the basin
in case of sedimentation. The existence and uniqueness of a weak solution in $L^\infty$ is
proved for this problem.



1 Introduction 

Thanks to recent progress in geosciences, the process of sedimentary basin infill 
is generally well understood today, and is often considered as the response to 
the interaction between three main processes: the available space created in
the basin by sea level variations, tectonics, compaction, ...; the sediment supply
(boundary fluxes, sediment production); and the transport of the sediments at
the surface of the basin. This interaction is essentially treated in a qualitative
manner using field data, such as seismic or well data, but this information
can be difficult and expensive to obtain. Thus, a 4D numerical model of the basin
appears as a powerful tool to solve the problem, and stratigraphic models are
developed to answer the need for quantifying the sedimentary basin infill.

A stratigraphic model describes the evolution through time of sedimentary 
basins in terms of geometry and rock properties. These models are suited 
for large scales in time and space (greater than 10 km and 10,000 yr), and
thus average several geological processes such as transport processes (river
transport, creep, slumps, ...). Descriptions of such models are given in [4], [6],
[7] and [8].

We consider here the stratigraphic model detailed in [2], in which sediments
are modeled as a mixture of several lithologies characterized by their grain
size population (sand or shale for example). The surficial flux of lithology $i$,
$i = 1, \ldots, L$, is taken as in [7] proportional to the slope of the topography
$h$ and to a lithology fraction $c_i^s$ defined at the surface of the basin. In this
paper, the diffusion coefficients are taken equal to one, leading to a simplified
model in the sense that the sediment thickness variable $h$ decouples from the
other unknowns, the $L$ concentrations $u_i$ in lithology $i$ inside the basin and
the $L$ surface concentrations $c_i^s$ in lithology $i$ at the top of the basin, and
satisfies a linear parabolic problem. The remaining equations, accounting for
the mass conservation of the lithologies, couple for all $i = 1, \ldots, L$ a first order
linear equation for the surface concentration variable $c_i^s$ and a linear advection
equation for the basin concentration variable $u_i$, for which $c_i^s$ appears as an
input boundary condition at the top of the basin. A weak formulation has been
introduced for this coupled problem (see Definition 1).

The aim of this paper is then to study the problem satisfied by the
concentration variables, and more especially to state the existence and uniqueness of
a weak solution in $L^\infty$. This result is given below in Theorem 1. The proof of
existence has already been achieved in [1] and will be briefly recalled in
section 3. It derives from the convergence of an implicit finite volume scheme. The
uniqueness will be obtained using the linearity of the coupled problem in the
concentration variables, the existence of a weak solution to the adjoint system
and two integration by parts formulae for the non smooth solutions of the direct
and adjoint problems. The paper is organized as follows: the mathematical
model and its weak formulation are described in section 2, and the proof of
existence and uniqueness of a weak solution is achieved in section 3.



2 Mathematical Model 

We consider in this paper the model defined by Eymard et al. in [2] in a sim- 
plified case for which the diffusion coefficients of the lithologies are taken equal 
to one. Furthermore, the sea level variations and the ground distortions are 
not taken into account. 

Let us denote by $h(x, t)$ the sediment thickness variable, function of time
$t > 0$ and of $x \in \Omega \subset \mathbb{R}^d$, $d = 1$ or $2$, the horizontal extension of the
basin. The sediments are modeled as a mixture of $L$ immiscible lithologies,
such as sand or shale, characterized by their grain size population, and
considered as incompressible materials of constant grain density and null
porosity. Inside the basin, the mixture is described by its composition given by
the $L$ concentrations $c_i(x, z, t) \ge 0$ in lithology $i$ defined on the domain
$B = \{(x, z, t) \mid x \in \Omega, t > 0, z < h(x, t)\}$, and satisfying $\sum_{i=1}^L c_i = 1$. The
sediments transported by the surficial fluxes, i.e. deposited at the surface in
case of sedimentation and passing through it in case of erosion, are
characterized by their concentrations $c_i^s(x, t) \ge 0$, defined on $\Omega \times \mathbb{R}_+^*$ and also satisfying
$\sum_{i=1}^L c_i^s = 1$.

Since the sediment fluxes are non zero only at the surface of the basin, no
change of the sediment composition occurs inside the basin: $\partial_t c_i = 0$ on $B$.
The evolution of $c_i$ is then only governed by the boundary condition at the
top of the basin stating that $c_i|_{z=h} = c_i^s$ in case of sedimentation ($\partial_t h > 0$).
An initial condition to the basin concentrations is also prescribed: $c_i|_{t=0} = c_i^0$
on $\{(x, z) \mid x \in \Omega, z < h^0(x)\}$.

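Let us now consider these equations in the new coordinate system $(x, \xi, t) =
(x', h(x', t') - z, t')$ in which the vertical position of a point is measured
downward from the top of the basin, and let us define $u_i(x, \xi, t) = c_i(x, h(x, t) - \xi, t)$
on $\Omega \times \mathbb{R}_+^* \times \mathbb{R}_+^*$ and $u_i^0(x, \xi) = c_i^0(x, h^0(x) - \xi)$ on $\Omega \times \mathbb{R}_+^*$. Then, we get the
new problem:

$$\begin{cases} \partial_t u_i + \partial_t h\,\partial_\xi u_i = 0 & \text{on } \Omega \times \mathbb{R}_+^* \times \mathbb{R}_+^*, \\ u_i|_{\xi=0} = c_i^s & \text{on } D^+ = \{(x, t) \in \Omega \times \mathbb{R}_+^* \mid \partial_t h(x, t) > 0\}, \\ u_i|_{t=0} = u_i^0 & \text{on } \Omega \times \mathbb{R}_+^*. \end{cases} \tag{1}$$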
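
Problem (1) can be solved along characteristics, since the composition is frozen at a fixed height z. The sketch below evaluates this solution at one horizontal position in the special case of pure sedimentation (h strictly increasing in time). The linear thickness law, the initial profile `u0`, the surface concentration `c_s` and all function names are hypothetical choices made for illustration.

```python
import math


def basin_concentration(xi, t, h, h0, u0, c_s, deposit_time):
    """u(xi, t) at a fixed horizontal position, assuming dh/dt > 0.

    xi is the depth below the top of the basin, h(t) the thickness,
    h0 = h(0), u0 the initial profile, c_s the surface concentration and
    deposit_time(z) the time s at which h(s) = z.
    """
    deposited = h(t) - h0  # thickness laid down since t = 0
    if xi >= deposited:
        # material already present at t = 0, now buried deeper
        return u0(xi - deposited)
    # material deposited when the surface was at height h(t) - xi
    return c_s(deposit_time(h(t) - xi))


# hypothetical data: constant sedimentation rate of 2 length units per time unit
H0, RATE = 10.0, 2.0
h = lambda t: H0 + RATE * t
deposit_time = lambda z: (z - H0) / RATE
u0 = lambda xi: 0.25
c_s = lambda s: 0.5 + 0.25 * math.sin(s)

top_value = basin_concentration(0.0, 1.0, h, H0, u0, c_s, deposit_time)
initial_value = basin_concentration(3.0, 0.0, h, H0, u0, c_s, deposit_time)
# the same material point, followed from t = 1 to t = 1.5
before = basin_concentration(1.0, 1.0, h, H0, u0, c_s, deposit_time)
after = basin_concentration(1.0 + RATE * 0.5, 1.5, h, H0, u0, c_s, deposit_time)
```

In case of erosion the top of the basin moves down through existing layers, and no boundary value is needed at the points where the thickness decreases, which is why the boundary condition in (1) is imposed on $D^+$ only.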

The surficial transport process is the multi-lithology diffusive model introduced
in [7] for which the flux of lithology $i$ is taken proportional to the slope and
to the surface concentration: $\varphi_i = -c_i^s k_i \nabla h$. The coefficient $k_i > 0$ is the
diffusion coefficient of the lithology $i$, chosen equal to one in this paper for
all $i = 1, \ldots, L$. Therefore, the sediment thickness variable decouples from
the other unknowns and satisfies a linear parabolic equation as we shall see 
later. Then, the model accounts for the conservation of the fraction Ad^(x, t) = 
lithology z, stating that 

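$$\begin{cases} u_i|_{\xi=0}\,\partial_t h + \text{div}(-c_i^s \nabla h) = 0 & \text{on } \Omega \times \mathbb{R}_+^*, \\ \sum_{i=1}^L c_i^s = 1 & \text{on } \Omega \times \mathbb{R}_+^*. \end{cases} \tag{2}$$

In this equation, $u_i|_{\xi=0}\,\partial_t h$ is formally equal to $\partial_t M_i$ thanks to (1). A
Neumann boundary condition is imposed on $h$ on $\partial\Omega \times \mathbb{R}_+^*$: $\nabla h \cdot n|_{\partial\Omega \times \mathbb{R}_+^*} = g$,
with $n$ the unit normal vector to $\partial\Omega$ outward to $\Omega$, as well as the initial
condition $h|_{t=0} = h^0$ on $\Omega$. Finally, Dirichlet input boundary conditions are
prescribed to the surface concentrations: $c_i^s|_{\Sigma^+} = \tilde{c}_i$ on
$\Sigma^+ = \{(x, t) \in \partial\Omega \times \mathbb{R}_+^* \mid g(x, t) > 0\}$.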

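Summing the first equation of (2) for all $i = 1, \ldots, L$, it appears that, for
this simplified model, the sediment thickness variable decouples from the other
unknowns and satisfies the linear parabolic equation

$$\begin{cases} \partial_t h - \Delta h = 0 & \text{on } \Omega \times \mathbb{R}_+^*, \\ \nabla h \cdot n|_{\partial\Omega \times \mathbb{R}_+^*} = g & \text{on } \partial\Omega \times \mathbb{R}_+^*, \\ h|_{t=0} = h^0 & \text{on } \Omega. \end{cases} \tag{3}$$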
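
Since the existence proof recalled later relies on an implicit finite volume discretization of (3), a one-dimensional sketch of such a scheme may help fix ideas. It is not the scheme of [1] or [2]: the uniform mesh, the function name `implicit_heat_step` and the data are assumptions, and the routine requires at least two cells. The Neumann data enter as boundary fluxes, so the total sediment volume changes by exactly the prescribed inflow at each step.

```python
import math


def implicit_heat_step(h, dx, dt, g_left, g_right):
    """One implicit finite volume step for dh/dt - h_xx = 0 on a uniform
    1D mesh with at least two cells, with grad(h).n = g on both ends."""
    n = len(h)
    lam = dt / dx ** 2
    lower = [-lam] * n
    upper = [-lam] * n
    diag = [1.0 + 2.0 * lam] * n
    rhs = list(h)
    # boundary cells have a single interior neighbour plus a prescribed flux
    diag[0] = 1.0 + lam
    diag[-1] = 1.0 + lam
    rhs[0] += dt / dx * g_left
    rhs[-1] += dt / dx * g_right
    # Thomas algorithm for the tridiagonal system
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    new = [0.0] * n
    new[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        new[i] = (rhs[i] - upper[i] * new[i + 1]) / diag[i]
    return new


# hypothetical data: 20 cells on (0, 1), sediment input on the left boundary
n_cells, dx, dt, n_steps = 20, 0.05, 0.01, 10
g_left, g_right = 0.3, -0.1
thickness = [math.sin(3.0 * (i + 0.5) * dx) for i in range(n_cells)]
volume_before = dx * sum(thickness)
for _ in range(n_steps):
    thickness = implicit_heat_step(thickness, dx, dt, g_left, g_right)
volume_after = dx * sum(thickness)
expected_gain = n_steps * dt * (g_left + g_right)
```

The implicit treatment makes the step unconditionally stable, which matters for the large time scales targeted by stratigraphic models.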

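The solution of this problem is then used in the remaining equations (4)
accounting for the mass conservation of the lithologies. They couple for each
lithology a first order linear equation for the surface concentration $c_i^s$ with a
linear advection equation for the concentration $u_i$, for which $c_i^s$ appears as an
input boundary condition at the top of the basin in case of sedimentation:

$$\begin{cases} u_i|_{\xi=0}\,\partial_t h + \text{div}(-c_i^s \nabla h) = 0 & \text{on } \Omega \times \mathbb{R}_+^*, \\ c_i^s|_{\Sigma^+} = \tilde{c}_i & \text{on } \Sigma^+ = \{(x, t) \in \partial\Omega \times \mathbb{R}_+^* \mid g(x, t) > 0\}, \\ \partial_t u_i + \partial_t h\,\partial_\xi u_i = 0 & \text{on } \Omega \times \mathbb{R}_+^* \times \mathbb{R}_+^*, \\ u_i|_{\xi=0} = c_i^s & \text{on } D^+ = \{(x, t) \in \Omega \times \mathbb{R}_+^* \mid \partial_t h(x, t) > 0\}, \\ u_i|_{t=0} = u_i^0 & \text{on } \Omega \times \mathbb{R}_+^*. \end{cases} \tag{4}$$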



In the sequel, the following hypotheses are made on the data:

Hypothesis 1 

(i) Q is an open hounded subset of of class 

(ii) G g G C^{df2 x iR+) fl L‘^{dQ x and g, are chosen so 

that the unique solution h of (3) is in C^{Q x [0,T]) for all T > 0, 

(Hi) Ci G with Ci >0 for i = 1, . . . , L, and Ci = 1, 

(iv) u^ G X u^ >0 for z = 1, . . . , L, and 

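(v) For all $T > 0$, the boundaries $\partial\Sigma_T^+$ and $\partial\Sigma_T^-$ of the sets $\Sigma_T^+ = \{(x, t) \in
\partial\Omega \times (0, T) \mid g(x, t) > 0\}$ and $\Sigma_T^- = \{(x, t) \in \partial\Omega \times (0, T) \mid g(x, t) < 0\}$ are the
union of a finite number of manifolds of dimension at most $d - 1$,

(vi) For all $T > 0$, the boundaries $\partial D_T^+$ and $\partial D_T^-$ of the sets $D_T^+ = \{(x, t) \in
\Omega \times (0, T) \mid \partial_t h(x, t) > 0\}$ and $D_T^- = \{(x, t) \in \Omega \times (0, T) \mid \partial_t h(x, t) < 0\}$ are
the union of a finite number of manifolds of dimension at most $d$.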

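We shall also denote by $\mathcal{L}$ the operator $\mathcal{L} = \partial_t + \partial_t h\,\partial_\xi$ and by $C_c^\infty(\mathbb{R}^q)$
the space of real valued functions $\{\varphi \in C^\infty(\mathbb{R}^q) \mid \text{Supp}(\varphi) \text{ bounded in } \mathbb{R}^q\}$.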

To cope with the difficulty of defining the trace of the basin concentration
at the top of the basin, we are looking for weak solutions defined as follows:

Definition 1. Let us assume that Hypothesis 1 holds, and let $h$ denote the
solution of (3). Then $(u_i, c_i^s) \in L^\infty(\Omega \times \mathbb{R}_+^* \times \mathbb{R}_+^*) \times L^\infty(\Omega \times \mathbb{R}_+^*)$ is said to
be a weak solution of problem (4) if it satisfies:

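(i) for all $\varphi \in \{\phi \in C_c^\infty(\mathbb{R}^{d+2}) \mid \phi(\cdot, 0, \cdot) = 0 \text{ on } \Omega \times \mathbb{R}_+^* \setminus D^+\}$,

$$\int_\Omega \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} [\partial_t \varphi(x, \xi, t) + \partial_t h(x, t)\,\partial_\xi \varphi(x, \xi, t)]\,u_i(x, \xi, t)\,dt\,d\xi\,dx + \int_\Omega \int_{\mathbb{R}_+} u_i^0(x, \xi)\,\varphi(x, \xi, 0)\,d\xi\,dx + \int_\Omega \int_{\mathbb{R}_+} \partial_t h(x, t)\,c_i^s(x, t)\,\varphi(x, 0, t)\,dt\,dx = 0, \tag{5}$$

(ii) for all $\psi \in \{\phi \in C_c^\infty(\mathbb{R}^{d+2}) \mid \phi(\cdot, 0, \cdot) = 0 \text{ on } \partial\Omega \times \mathbb{R}_+^* \setminus \Sigma^+\}$,

$$-\int_\Omega \int_{\mathbb{R}_+} \int_{\mathbb{R}_+} [\partial_t \psi(x, \xi, t) + \partial_t h(x, t)\,\partial_\xi \psi(x, \xi, t)]\,u_i(x, \xi, t)\,dt\,d\xi\,dx - \int_\Omega \int_{\mathbb{R}_+} u_i^0(x, \xi)\,\psi(x, \xi, 0)\,d\xi\,dx + \int_{\mathbb{R}_+} \Big( \int_\Omega c_i^s(x, t)\,\nabla h(x, t) \cdot \nabla \psi(x, 0, t)\,dx - \int_{\partial\Omega} \tilde{c}_i(x, t)\,g(x, t)\,\psi(x, 0, t)\,d\gamma(x) \Big)\,dt = 0. \tag{6}$$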



Then, the main result of this paper is the following Theorem. Its proof will be
developed in the next section.

Theorem 1. Assuming that Hypothesis 1 holds, there exists a weak solution
$(u_i, c_i^s) \in L^\infty(\Omega \times \mathbb{R}_+^* \times \mathbb{R}_+^*) \times L^\infty(\Omega \times \mathbb{R}_+^*)$ for all $i \in \{1, \ldots, L\}$ to problem
(4) in the sense of Definition 1, and $u_i$ is unique.



3 Existence and Uniqueness of a Weak Solution 

The aim of this section is to prove Theorem 1. The proofs of Lemmas 1 and 2
used in the sequel are technical and will be detailed in a forthcoming paper.

The existence of a weak solution $(u_i, c_i^s) \in L^\infty(\Omega \times \mathbb{R}_+^* \times \mathbb{R}_+^*) \times L^\infty(\Omega \times \mathbb{R}_+^*)$
to (4) in the sense of Definition 1 is obtained by convergence of an implicit
finite volume scheme for the model, and has already been proved in [1]. Let us
just recall the main stages of this proof.

In the sequel, we shall consider admissible finite volume meshes defined as 
follows : 

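Definition 2. Let $\Omega$ be a bounded domain of $\mathbb{R}^d$, $d = 1$ or $2$. An admissible
finite volume mesh of $\Omega$ for the discretization of problem (3)-(4) is given by a
family of "control volumes", denoted by $\mathcal{K}$, which are open disjoint subsets of
$\Omega$, and a family of points of $\Omega$, denoted by $\mathcal{P}$, satisfying the following
properties:

1. The closure of the union of all the control volumes of $\mathcal{K}$ is $\overline{\Omega}$.

2. For any $\kappa, \kappa' \in \mathcal{K}$ with $\kappa \ne \kappa'$, either the $(d-1)$-dimensional measure of
$\overline{\kappa} \cap \overline{\kappa'}$, denoted by $m(\overline{\kappa} \cap \overline{\kappa'})$, is null, or it is strictly positive and $\overline{\kappa} \cap \overline{\kappa'}$ is
included in a hyperplane of $\mathbb{R}^d$. In the following, we will denote by $\Sigma_{int}$ the
family of subsets $\sigma$ of $\overline{\Omega}$ contained in hyperplanes of $\mathbb{R}^d$ with strictly positive
measures, and such that there exist $\kappa, \kappa' \in \mathcal{K}$ with $m(\overline{\kappa} \cap \overline{\kappa'}) > 0$ and $\sigma = \overline{\kappa} \cap \overline{\kappa'}$.
We shall also denote by $\kappa|\kappa' \in \Sigma_{int}$ the edge between the cells $\kappa$ and $\kappa'$.

3. The family $\mathcal{P} = (x_\kappa)_{\kappa \in \mathcal{K}}$ is such that $x_\kappa \in \overline{\kappa}$ (for any $\kappa \in \mathcal{K}$), and, if
$\sigma = \kappa|\kappa'$, it is assumed that $x_\kappa \ne x_{\kappa'}$ and that the straight line going through
$x_\kappa$ and $x_{\kappa'}$ is orthogonal to the edge $\sigma$. We shall denote by $d(\kappa, \kappa')$ the distance
between the points $x_\kappa$ and $x_{\kappa'}$.

4. For any $\kappa \in \mathcal{K}$, there exists a subset $\Sigma_\kappa$ of $\Sigma_{int}$ such that $\partial\kappa \setminus \partial\Omega =
\overline{\kappa} \setminus (\kappa \cup \partial\Omega) = \cup_{\sigma \in \Sigma_\kappa} \sigma$.

We shall denote by $(\mathcal{K}, \Sigma_{int}, \mathcal{P})$ this mesh, and by $\delta\mathcal{K} = \sup\{\text{diam}(\kappa), \kappa \in \mathcal{K}\}$
its size.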

Let $(\mathcal{K}, \Sigma_{int}, \mathcal{P})$ be an admissible mesh of $\Omega$. The time discretization is denoted
by $t^n$, $n \in \mathbb{N}$, such that $t^0 = 0$ and $\Delta t^{n+1} = t^{n+1} - t^n$. The superscript
$n$, $n \in \mathbb{N}$, will be used to denote that the unknowns are considered at time
$t^n$. For each control volume $\kappa \in \mathcal{K}$ and each time $n \ge 0$, $h_\kappa^n$ shall denote
the approximation of the sediment thickness on $\kappa$ at time $t^n$, $c_{i,\kappa}^n(z)$, $z \in
(-\infty, h_\kappa^n)$, the approximation of the basin concentration $c_i$ in lithology $i$ in
the column $\{(x, z) \mid x \in \kappa, z < h(x, t^n)\}$, and $c_{i,\kappa}^{s,n}$ the approximation of
the surface concentration in lithology $i$ at time $t^n$ on $\kappa$. Then, (3)-(4) is
discretized by a fully implicit time integration and a finite volume method
with cell centered variables. For the computation of the fluxes at the edges of
the control volumes, the discretization uses an upstream weighted evaluation of
the surface concentrations. The approximate concentration $c_{i,\kappa}^{n+1}$ is the solution
at time $t^{n+1}$ of the conservation equation

dtCi^^{z,t) = 0, for all <t < < h^{t) = + (t - f") , 

= cl^{z) for all z < /i”, 

and — ^) for all ^ > 0. One can refer to [2] or [1] for the 

complete numerical scheme. 

Then, the approximate sediment thickness satisfies an implicit 

finite volume numerical scheme for the parabolic problem (3) for which exis- 
tence, uniqueness and error estimates have already been proved in [3]. Con- 
cerning the concentration variables, we show the existence of solutions to the 
discrete problem bounded in the interval [0, 1], which are unique except for the 
surface concentration variables For any admissible mesh (/C, Eint^V) of 

Q, any time step Z\t > 0, and i = 1, . . . , L, let us define the piecewise constant 
functions hjc^Ati cf on i? x and Ui^jc^At on i? x x by 

for all X E K,, K E 1C, t E n > 0, ^ G iR!j_, where /i^, and 

are any given solution of the discrete problem with bounded in the in- 

terval [0, 1]. For all m E JN, let (/C^, E^^, Vm) be an admissible mesh of i? and 
Atm > 0, and let us assume that Atm 0, dtCmj Atm — > 0 as m — > oo and 
that there exists a > 0 such that, for all m E JN, maxcrei:^. NKm < Let 

’ v_ , d{K,K') — 

a—K.\ K.' 

hKrr^^Atrrr,^ Km Atm ^^fincd by (7) wlth JC = ICm aud At = Atm- 

Then the stability of the discrete concentrations gives the convergence, up to 
a subsequence, of {ui^)CmAtm^^i,KmAtm^'^^^ weak-^ topology 

as $m \to \infty$. To prove that the limit $(u_i, c_i^s)$ is a weak solution of (4), we finally
use an interpolation in time of the approximate basin concentration
which converges towards $u_i$, and a weak-BV estimate for the flux terms,
which is an adaptation to the coupling of a parabolic and a hyperbolic
equation of the result proved in [3] for the coupling of an elliptic and a hyperbolic
equation in the case of a two-phase Darcy flow. The detailed proof is given in
[1].



Let us now show the uniqueness of $u_i$. It is achieved using mainly the
linearity of (4) in the concentration variables and the adjoint problem on
$\Omega \times \mathbb{R}_+^* \times (0, T)$, $T > 0$.

For any given surface concentration $c_i^s \in L^\infty(\Omega \times \mathbb{R}_+^*)$, we have first
studied the weak formulation (5) of the linear advection equation
$\partial_t u_i + \partial_t h\,\partial_\xi u_i = 0$ with
input boundary condition $c_i^s$ on $D^+$ and initial condition $u_i^0$. Using the
characteristic solution of this problem (see [5]), we have proved the following Lemma,
in which an integration by parts formula for the solutions of the advection
equation and its adjoint problem is stated:

Lemma 1. Hypothesis 1 is assumed to hold. Then, for any time T > 0, any 
functions f G x x (0,T)), P G and G L^(i? x Ml), 

the equation 



Cv = f on Q X Ml X (0,T), 
v\^=o = l^ on 
v\t=o = on Q X Ml, 



( 8 ) 



has a unique weak solution in L°°(i7 x Ml x (0,T)) in the sense that for 
all G {(/) G {M^^‘^) I (/)(., 0, .) = 0 on f2 X (0, T) \ and ., T) = 
0 on i? X Ml}, one has 

InljR lo 

-h / / v^{x,^) (p{x,^,0) d^dx / / dth{x,t) l^{x,t) (f{x,0,t) dtdx = 0. 

Jn JiR+ Jn Jo 

( 9 ) 

The weak solution v of (8) has a trace on t = T in x Ml), and the 

function vdth has a trace on ^ — 0 in x (0,T)), such that for any 

^ G C^{M^~^‘^) one has 

[ f f UT(p)v + f(p){x,^,t)dtd^dx+ ( [ (v{x,C,0)cp{x,^,0) 

Jn J]R+ Jo ^ '' ^ Jn J]R+ ^ 

-v(x, T) (f{x, T)^ d4 dx + II dth{x, t) v{x, 0, t) cp{x, 0, t) dt dx = 0. 

( 10 ) 

Let T > 0 and w be the weak solution in x Ml x (0,T)) of the adjoint 

equation 







—Cw = r on Q X x (0,T), 

onV:^, ( 11 ) 

w\t=T = on Q X 



defined in a similar way as above with r G L^{f? x x (0,T)) a compactly 
supported function on D x x [0,T], G x iR^) a compactly sup- 

ported function on Q X IRj^, and G L^{f2 x (0,T)). Then, one has 



[ [ [ (v (Cw) {Cv)w^{x,^,t) dtd^dx - f f (v{x,^,T)w{x,^,T) 

Jn JM+ Jo ^ T ^ 

-v(x,^,0)w{x,^,0)^d^dx-{- j J dth{x,t) v{x,0,t) w{x,0,t) dt dx = 



0 . 

( 12 ) 



Let us denote by $(v_i, d_i^s)$ the difference between any two weak solutions of
(4). From the linearity of the set of equations (4) in the concentration variables,
the functions $(v_i, d_i^s)$ satisfy the weak formulation (5)-(6) with homogeneous
boundary and initial conditions.

Let T > 0. From Lemma 1, the function Vidth has a trace at ^ = 0 in 
L°°(i7 X (0,T)) denoted by Then, from the integration by part 

formula (10) of Lemma 1 and the weak formulation (6), it results that for all 
(f G {4> G I (j){x,t) = 0 on df2x {0,T)\E^, and (f){x,T) = 0 on i?}, 

one has 



J J (^Vi{x,0,t) dth{x,t) (p{x,t)-\-d^{x,t)\7h{x,t)''V(p{x,t)^ dt dx = 0. (13) 



We easily deduce, using (13) and dth — Ah — 0, that 



div{-dtVh) - -Vi\^=odth G L^{f2 X (0,T)), (14) 

Wh^Vdt = {vi\^=o-dt)dth G L^{Qx{0,T)). (15) 



Let us now consider 
' -Wi\^=o dth 

< 



le adjoint system 

div(g|V/i) == 0 
^1\e- = 0 

-Cwi = Vi 

Wi|^=o = qt 

Wi\t=T = Vi\t=‘ 



on i? X (0, T), 
on 

on i? X X (0,T), 
on TJrp , 
on 17 X IR\. 



(16) 



The direct and adjoint problems are very close, apart from the non vanishing 
right hand side Vi G {O x x in the advection equation of (16). Then, 
the existence of a weak solution {wi, d|) G L^{Q x iR^ x ]R*^) x L°°(i7 x iR!^) 
to the adjoint problem, defined similarly as in Definition 1, can be obtained
in much the same way as in [1] by the convergence of a finite volume numerical
scheme, adapted to the nonvanishing right-hand side in the advection
equation.
Considering such a weak solution, the following equation is derived as above

dW{qt\/h)=Wi\^=odth e X (0,T)). (17) 

From (15) and (17), the function div(g|df V/i) is defined in x (0,T)) 

and hence the vector field qfd^Vh has a normal trace in L^(0,T; H~i (dO)). 
As formally d\ vanishes on qf vanishes on and the normal trace g of 
Eh vanishes on dQ x (0, T) \ {E^ U E:^), the normal trace of qfdfVh vanishes 
on the boundary dQ x (0,T). We can prove this result stated by the following 
Lemma : 



Lemma 2. Hypothesis 1 is assumed to hold. Then, for any $T > 0$, any weak
solutions $(w_i, q_i^s)$ of the adjoint problem (16), and $(v_i, d_i^s)$ of problem (4) with
homogeneous boundary and initial conditions, one has

$$\int_\Omega \int_0^T \operatorname{div}\left(q_i^s\, d_i^s\, \nabla h\right) dt\, dx = 0.$$



According to the definition of the characteristic solution of (11) (see [5]) and
since the velocity $\partial_t h$ is uniformly bounded on $\Omega \times [0,T]$ for any time $T > 0$,
the function $v_i$ (resp. its trace $v_i|_{t=T}$) is compactly supported in
$\Omega \times \mathbb{R}_+ \times [0,T]$ (resp. in $\Omega \times \mathbb{R}_+$). Applying the integration by parts
formula (12) of Lemma 1 to $v = v_i$ and $w = w_i$, we get that for any time $T > 0$

$$\int_\Omega \int_{\mathbb{R}_+} \int_0^T |v_i|^2(x,\xi,t)\, dt\, d\xi\, dx + \int_\Omega \int_{\mathbb{R}_+} |v_i|^2(x,\xi,T)\, d\xi\, dx = \int_\Omega \int_0^T \partial_t h(x,t)\, v_i(x,0,t)\, w_i(x,0,t)\, dt\, dx. \qquad (18)$$



From Lemma 2 and the integration over i? x (0,T) of (17) multiplied by 
we obtain 

dtdx = 0. 



[ [ d^^{x,t)wi{x,0,t) dth{x,t) + ql{x,t) Ed^{x,t) • Eh{x,t) 

Jo Jo *- 



Also, multiplying (15) by qf and integrating over i7 x (0,T), we get 

In fo 0’ 0 - 0) t) dth{x, t)- 

—qf{x,t)'Vdf{x,t) ■ Vh{x,t)]dtdx = 0. 



(19) 

( 20 ) 



Summing (19) and (20) and taking into account the boundary conditions 
= Qi on Vi\^=o = df on and that dth = 0 on i? x (0,T) \ 
{V^ U Vf , ) , we obtain 



J j (wi{x,0,t) d^{x,t) d- Vi{x,0,t) q^{x,t) — d^ {x , t) q^ {x , t)^ dth{x,t)dtdx 
= Vi{x,0,t) Wi{x,0,t) dth{x,t)dt dx = 0. 

Jo Jo 



( 21 ) 
Equation (21) together with (18) concludes the proof of Theorem 1.

This theorem, together with the convergence result on the solutions of the
implicit finite volume numerical scheme seen previously, also gives the
convergence of the full sequence of approximate solutions
$(u_{i,\mathcal{K}_m,\Delta t_m})_{m \in \mathbb{N}}$ towards the weak solution $u_i$ of (4).

The proof, and in particular Lemmas 1 and 2 and the existence of a weak
solution to the adjoint problem, will be detailed in a forthcoming paper.
Summary. We propose and analyze an efficient numerical scheme for nonlinear
degenerate parabolic convection-reaction-diffusion equations. We discretize the dif- 
fusion term, which generally involves a full matrix diffusion tensor, by means of 
piecewise linear nonconforming (Crouzeix-Raviart) finite elements over a triangula- 
tion of the space domain, or using the stiffness matrix of the hybridization of the 
lowest order Raviart-Thomas mixed finite element method. The other terms are 
discretized by means of a finite volume scheme on a dual mesh, where the dual vol- 
umes are constructed around the sides of the original triangulation. Checking the 
local Peclet number, we set up the exact necessary amount of upstream weighting 
to avoid spurious oscillations in the velocity dominated case. Under the regularity 
condition for the triangulation, using a priori estimates and Kolmogorov’s relative 
compactness theorem, the convergence of the scheme is proved. 



1 Introduction 

The contaminant transport equation can be written in the form

$$\frac{\partial \beta(c)}{\partial t} - \nabla\cdot(\mathbf{D}\nabla c) + \nabla\cdot(c\mathbf{v}) + F(c) = q, \qquad (1)$$

where $c$ is the unknown concentration of the contaminant, the function $\beta(\cdot)$
represents time evolution and equilibrium adsorption reaction, $\mathbf{v}$ is the velocity
field, $\mathbf{D}$ is the diffusion-dispersion tensor, the function $F(\cdot)$ represents the
changes due to chemical reactions, and $q$ stands for the sources. The main
features of equation (1) are its degeneracy, since $\beta'$ may be unbounded, the
possible dominance of the convection term, and the presence of a heterogeneous
and anisotropic diffusion-dispersion tensor.
The convergence of a finite volume scheme for the equation (1) with $\mathbf{D} = Id$
and $F = 0$ has been shown in [7]. Finite volumes with upstream weighting
techniques are unconditionally stable; however, there are geometrical restrictions
on the mesh for the discretization of the diffusion term and there is no
general prescription for how to discretize full tensors. The finite element method
for degenerate parabolic problems has been studied e.g. in [2]. One can discretize
full tensors and there are no restrictions on the mesh. However, spurious
oscillations may appear in the velocity dominated case or in the presence of
a reaction term. Hence a quite intuitive idea is to combine finite volume and
finite element methods, trying to use the “best of both worlds”. In [1], the
authors introduce a combined scheme for a convection-diffusion equation with
a nonlinear convection term in two space dimensions. In the present paper,
we prove the convergence of this scheme for the equation (1) in two or
three space dimensions. We extend the techniques used in [7] to a scheme
with negative transmissibilities, general meshes satisfying only the regularity
assumption, and cases when the discrete maximum principle is not satisfied.



2 The degenerate parabolic problem 

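We consider the equation (1) in a polygonal domain $\Omega \subset \mathbb{R}^d$, $d = 2, 3$, and on
a time interval $(0,T)$, $0 < T < \infty$. We set $Q_T = \Omega \times (0,T)$. We impose the
initial condition

$$c(\mathbf{x}, 0) = c_0(\mathbf{x}), \quad \mathbf{x} \in \Omega, \qquad (2)$$

and a homogeneous Dirichlet boundary condition

$$c(\mathbf{x}, t) = 0, \quad \mathbf{x} \in \partial\Omega,\ t \in (0,T). \qquad (3)$$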

We make the following assumption on the data: 

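Assumption (A)

(A1) $\beta \in C(\mathbb{R})$, $\beta(0) = 0$, is a strictly increasing function such that

$$|\beta(a) - \beta(b)| \ge c_\beta |a - b| \quad \forall a, b \in \mathbb{R},\ c_\beta > 0,$$
or

(A2) (A1) is satisfied and in addition there exists $P \in \mathbb{R}$, $P > 0$, such that
$|\beta(x)| \le C_\beta$ in $[-P, P]$, $C_\beta > 0$, and $\beta$ is Lipschitz continuous with a constant
$L_\beta$ on $(-\infty, -P]$ and $[P, +\infty)$;

(A3) $\mathbf{D}_{ij} \in L^\infty(Q_T)$, $|\mathbf{D}_{ij}| \le C_{\mathbf{D}}$ a.e. in $Q_T$, $1 \le i,j \le d$, $C_{\mathbf{D}} > 0$, $\mathbf{D}$ is
a symmetric and uniformly positive definite tensor for almost all $t \in (0,T)$
with a constant $c_{\mathbf{D}} > 0$,

$$\mathbf{D}(\mathbf{x}, t)\boldsymbol{\eta} \cdot \boldsymbol{\eta} \ge c_{\mathbf{D}}\, \boldsymbol{\eta} \cdot \boldsymbol{\eta} \quad \forall \boldsymbol{\eta} \in \mathbb{R}^d, \ \text{for a.e. } (\mathbf{x},t) \in Q_T;$$

(A4) $\mathbf{v} \in L^2(0,T; \mathbf{H}(\operatorname{div}, \Omega)) \cap \mathbf{L}^\infty(Q_T)$ satisfies $\nabla \cdot \mathbf{v} = q_S \ge 0$ a.e. in $Q_T$,
$|\mathbf{v} \cdot \mathbf{n}| \le C_{\mathbf{v}}$, $C_{\mathbf{v}} > 0$, a.e. on $\ell \times (0,T)$ for each hyperplane $\ell \subset \Omega$ with
normal vector $\mathbf{n}$;

(A5) $F(0) = 0$, $F$ is a nondecreasing, Lipschitz continuous function with
a constant $L_F$
or

(A6) $F(0) = 0$, $F$ is a Lipschitz continuous function with a constant $L_F$ and
$xF(x) \ge 0$ for $x < 0$ and $x > M$, $M > 0$;

(A7) $q \in L^2(Q_T)$, where $q = q_S c_S$ with $c_S \in L^\infty(Q_T)$, $0 \le c_S \le M$ a.e. in
$Q_T$;

(A8) $c_0 \in L^\infty(\Omega)$, $0 \le c_0 \le M$ a.e. in $\Omega$.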

We now give the definition of a weak solution of the problem (1) - (3). 



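Definition 1. (Weak solution) We say that a function $c$ is a weak solution
of (1)-(3) if $c \in L^2(0,T;H_0^1(\Omega))$, $\beta(c) \in L^\infty(0,T;L^2(\Omega))$, and

$$-\int_0^T\!\!\int_\Omega \beta(c)\varphi_t \, d\mathbf{x}\, dt - \int_\Omega \beta(c_0)\varphi(\cdot,0)\, d\mathbf{x} + \int_0^T\!\!\int_\Omega \mathbf{D}\nabla c \cdot \nabla\varphi \, d\mathbf{x}\, dt - \int_0^T\!\!\int_\Omega c\mathbf{v} \cdot \nabla\varphi \, d\mathbf{x}\, dt + \int_0^T\!\!\int_\Omega F(c)\varphi \, d\mathbf{x}\, dt = \int_0^T\!\!\int_\Omega q\varphi \, d\mathbf{x}\, dt \qquad (4)$$

for all $\varphi \in L^2(0,T;H_0^1(\Omega))$ with $\varphi_t \in L^\infty(Q_T)$, $\varphi(\cdot,T) = 0$.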



3 The combined finite element-finite volume scheme



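We suppose a family of triangulations $\{\mathcal{T}_h\}_h$ of the domain $\Omega$, where each $\mathcal{T}_h$
consists of closed simplices (triangles in the case $d = 2$, tetrahedrons when
$d = 3$) such that $\overline{\Omega} = \bigcup_{K \in \mathcal{T}_h} K$. We define $h = \max_{K \in \mathcal{T}_h} \operatorname{diam}(K)$ and suppose
that $\{\mathcal{T}_h\}_h$ is regular:

Assumption (B)

(B1) There exists a positive constant $\kappa_{\mathcal{T}}$ such that

$$\max_{K \in \mathcal{T}_h} \frac{\operatorname{diam}(K)}{\rho_K} \le \kappa_{\mathcal{T}} \quad \forall h > 0,$$

where $\rho_K$ is the diameter of the largest ball inscribed in the simplex $K$.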

We also use a dual partition $\mathcal{D}_h$ of $\Omega$, such that $\overline{\Omega} = \bigcup_{D \in \mathcal{D}_h} D$. A
dual volume $D$ associated with the side $\sigma_D$ is constructed by connecting the
barycentres of every $K \in \mathcal{T}_h$ that contains $\sigma_D$ through the vertices of $\sigma_D$. For
$\sigma_D$ from the boundary, the contour is completed by $\sigma_D$ itself, see Fig. 1. We
denote by $Q_D$ the barycentre of $\sigma_D$, by $\mathcal{D}_h^{int}$ the set of all interior and by
$\mathcal{D}_h^{ext}$ the set of all boundary dual volumes, and by $\mathcal{N}(D)$ the set of all adjacent
volumes to the volume $D$. For $E \in \mathcal{N}(D)$, we finally set $\sigma_{D,E} = \partial D \cap \partial E$.

We suppose a partition of the time interval $(0, T)$ such that
$0 = t_0 < \dots < t_n < \dots < t_N = T$ and define $\Delta t_n = t_n - t_{n-1}$,
$\Delta t = \max_{1 \le n \le N} \Delta t_n$. When Assumption (A5) is satisfied, we do not
impose any restriction on $\Delta t$. When only (A6) holds, we suppose:

Assumption (C) 

(Cl) The maximum time step condition At < is satisfied. 




Fig. 1. Triangles $K, L \in \mathcal{T}_h$ and dual volumes $D, E \in \mathcal{D}_h$ associated with edges
$\sigma_D$, $\sigma_E$

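We define the following finite-dimensional spaces:

$$X_h = \{\varphi_h \in L^2(\Omega);\ \varphi_h|_K \text{ is linear } \forall K \in \mathcal{T}_h,\ \varphi_h \text{ is continuous at } Q_D,\ D \in \mathcal{D}_h^{int}\},$$

$$X_h^0 = \{\varphi_h \in X_h;\ \varphi_h(Q_D) = 0 \ \forall D \in \mathcal{D}_h^{ext}\}.$$

The basis of $X_h$ is spanned by the shape functions $\varphi_D$, $D \in \mathcal{D}_h$, such that
$\varphi_D(Q_E) = \delta_{DE}$, $E \in \mathcal{D}_h$, $\delta$ being the Kronecker delta. We equip $X_h$ with the
norm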




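Definition 2. (Combined scheme) The fully implicit combined
nonconforming/mixed-hybrid FE-FV scheme reads: find the values $c_D^n$,
$n \in \{0, 1, \dots, N\}$, $D \in \mathcal{D}_h$, such that

$$c_D^0 = \frac{1}{|D|}\int_D c_0(\mathbf{x})\, d\mathbf{x}, \quad D \in \mathcal{D}_h^{int}, \qquad (6)$$

$$c_D^n = 0, \quad D \in \mathcal{D}_h^{ext},\ n \in \{0, 1, \dots, N\}, \qquad (7)$$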

" EeAr(D) EeAr(D) 

+F{cl) \D\=ql \D\ DeVr\ne{l,2,...,N}. 



( 8 ) 
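In (6)-(8),

$$v_{D,E}^n = \frac{1}{\Delta t_n}\int_{t_{n-1}}^{t_n}\int_{\sigma_{D,E}} \mathbf{v}(\mathbf{x},t) \cdot \mathbf{n}_{D,E}\, d\gamma(\mathbf{x})\, dt$$

with $\mathbf{n}_{D,E}$ the unit normal vector of the side $\sigma_{D,E}$, outward to $D$, and

$$q_D^n = \frac{1}{\Delta t_n |D|}\int_{t_{n-1}}^{t_n}\int_D q(\mathbf{x},t)\, d\mathbf{x}\, dt.$$

Finally,

$$\overline{c_{D,E}^n} = \begin{cases} c_D^n + \alpha_{D,E}^n (c_E^n - c_D^n) & \text{if } v_{D,E}^n \ge 0, \\ c_E^n + \alpha_{D,E}^n (c_D^n - c_E^n) & \text{if } v_{D,E}^n < 0. \end{cases} \qquad (9)$$

Here, $\alpha_{D,E}^n$ is the coefficient of the amount of upstream weighting, defined by

$$\alpha_{D,E}^n = \frac{\max\left\{\min\left\{\mathbb{D}_{D,E}^n, \tfrac{1}{2}|v_{D,E}^n|\right\}, 0\right\}}{|v_{D,E}^n|}, \quad v_{D,E}^n \ne 0. \qquad (10)$$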



Remark 1. (Numerical flux) We can easily see that $0 \le \alpha_{D,E}^n \le 1/2$, i.e. the
numerical flux defined by (9) ranges from the centered scheme to the full
upstream weighting.
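
The choice of the upstream weighting can be illustrated by a minimal sketch. It assumes that (9) takes the face value $c_D + \alpha(c_E - c_D)$ when the flux leaves $D$ and $c_E + \alpha(c_D - c_E)$ otherwise, and that (10) reads $\alpha = \max\{\min\{\mathbb{D}_{D,E}, |v_{D,E}|/2\}, 0\}/|v_{D,E}|$; the function names and scalar arguments are hypothetical and not part of the original scheme description.

```python
def upstream_weight(diffusion, velocity):
    """Coefficient alpha of (10) for one pair of dual volumes (assumed form).

    diffusion: entry D_{D,E}^n of the diffusion matrix
    velocity:  flux v_{D,E}^n through the face, must be nonzero
    """
    if velocity == 0:
        raise ValueError("alpha is only defined for a nonzero flux v_{D,E}^n")
    return max(min(diffusion, 0.5 * abs(velocity)), 0.0) / abs(velocity)


def convective_flux(c_d, c_e, diffusion, velocity):
    """Return v_{D,E}^n times the weighted face value of (9) (assumed form)."""
    if velocity == 0:
        return 0.0  # nothing is transported through this face
    alpha = upstream_weight(diffusion, velocity)
    if velocity > 0:
        face_value = c_d + alpha * (c_e - c_d)  # flow leaves D
    else:
        face_value = c_e + alpha * (c_d - c_e)  # flow enters D from E
    return velocity * face_value
```

With a vanishing or negative diffusion entry the sketch falls back to full upstream weighting ($\alpha = 0$); with a diffusion entry of at least $|v_{D,E}|/2$ it gives the centered value ($\alpha = 1/2$).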



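Diffusion matrix from the nonconforming method We set

$$\mathbb{D}_{D,E}^n = -\sum_{K \in \mathcal{T}_h} (\mathbf{D}^n \nabla\varphi_E, \nabla\varphi_D)_{0,K}, \quad D, E \in \mathcal{D}_h,\ n \in \{1, 2, \dots, N\},$$

where

$$\mathbf{D}^n(\mathbf{x}) = \frac{1}{\Delta t_n}\int_{t_{n-1}}^{t_n} \mathbf{D}(\mathbf{x}, t)\, dt, \quad n \in \{1, 2, \dots, N\},\ \mathbf{x} \in \Omega. \qquad (11)$$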

Diffusion matrix from the mixed-hybrid method Let us consider the 
problems 



$$-\nabla \cdot (\mathbf{D}^n \nabla p) = g \ \text{ in } \Omega, \qquad p = 0 \ \text{ on } \partial\Omega,$$

at each discrete time $t_n$, with $g \in L^2(\Omega)$. Then, using the hybridization of the
lowest order Raviart-Thomas mixed finite element method, one ends up with a
linear system $\mathbb{M}^n \Lambda = G$ for the Lagrange multipliers $\Lambda$ located in barycentres
of sides, see [4, Section V.1.2]. Using the analytic form of $\mathbb{M}^n$, we define

W^^E = = ~ X/ VpD)o,ic D,E ^Vh , n £ {1,2, ... ,N] 

KgTh 

where 



xG K , K €Th,nG {1,2,...,N}. 

( 12 ) 
In the sequel, we shall treat separately the following special case, satisfied
e.g. when $\mathbf{D} = Id$ and when there is a maximal angle condition:

Assumption (D)

(D1) All non-diagonal terms of the diffusion matrix are nonnegative, i.e.
$$\mathbb{D}_{D,E}^n \ge 0 \quad \forall D, E \in \mathcal{D}_h^{int},\ D \ne E,\ \forall n \in \{1, 2, \dots, N\}.$$



4 Existence, uniqueness, and discrete properties 



Lemma 1. (Conservativity of the scheme) The scheme (6) -(8) is conser- 
vative with respect to the dual mesh. 

One can easily verify that ^ ^ and n G 

EeM{D) 

{1,2,..., N}. Adding the finite volume discretization of the other terms, the 
assertion follows. 

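Lemma 2. (Coercivity of the bilinear diffusion form) We have

$$-\sum_{D \in \mathcal{D}_h} c_D \sum_{E \in \mathcal{D}_h} \mathbb{D}_{D,E}^n c_E \ge c_{\mathbf{D}} \|c_h\|_{X_h}^2 \quad \forall c_h = \sum_{D \in \mathcal{D}_h} c_D \varphi_D \in X_h,\ \forall n \in \{1, 2, \dots, N\}.$$

The assertion follows immediately from Assumption (A3) and the subsequent
uniform positive definiteness of the diffusion tensors (11) or (12).

Lemma 3. (Estimate on the convection term) We have

$$\sum_{D \in \mathcal{D}_h^{int}} c_D \sum_{E \in \mathcal{N}(D)} v_{D,E}^n\, \overline{c_{D,E}} \ge 0 \quad \forall c_h = \sum_{D \in \mathcal{D}_h} c_D \varphi_D \in X_h^0,\ \forall n \in \{1, 2, \dots, N\}.$$

The proof is similar to that in [6] for pure finite volume schemes.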

Lemma 4. (A priori estimate for an extended scheme) Let u G [0,1]. 
We define an extended scheme by 



c?) = ^^co(x)dx DGl?r, 

cf) = 0 D e n G (0, 1, . . . , A} , 




^D,E cIe ^D,E Cf) E + 

Eevjp^ EeAf{D) 



+uF{cl) \D\=uqrf, \D\ G n G {1, 2, . . . , A}. 



(13) 

(14) 

(15) 



Then {cf))‘^\D\ < Ces for all n G {1, 2, . . . , A} with Cqs > 0. 
DeVh 
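Proof:

We multiply (15) by $\Delta t_n c_D^n$ and sum over $D \in \mathcal{D}_h^{int}$ and $n \le k$ to have

$$\sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D| + u\, c_{\mathbf{D}} \sum_{n=1}^{k} \Delta t_n \|c_h^n\|_{X_h}^2 \le u \sum_{n=1}^{k} \Delta t_n \sum_{D \in \mathcal{D}_h^{int}} c_D^n q_D^n |D| + u L_F M^2 \sum_{n=1}^{k} \Delta t_n \sum_{D \in \mathcal{D}_h^{int}} |D|, \qquad (16)$$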

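considering $u \ge 0$, Lemmas 2 and 3, the fact that for $c_D^n < 0$ or $c_D^n > M$,
$F(c_D^n)c_D^n \ge 0$ follows from Assumption (A5) or (A6), and that when $0 \le c_D^n \le M$,
$-F(c_D^n)c_D^n \le |F(c_D^n)||c_D^n| \le L_F M^2$. Let us now introduce a function $B$,

$$B(s) = \beta(s)s - \int_0^s \beta(\tau)\, d\tau, \quad s \in \mathbb{R}.$$

One can then derive

$$B(c_D^n) - B(c_D^{n-1}) = [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n - \int_{c_D^{n-1}}^{c_D^n} [\beta(\tau) - \beta(c_D^{n-1})]\, d\tau. \qquad (17)$$

Using that $\beta$ is nondecreasing, one can easily show that
$\int_{c_D^{n-1}}^{c_D^n} [\beta(\tau) - \beta(c_D^{n-1})]\, d\tau \ge 0$. In view of this and (17), one has

$$\sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [B(c_D^n) - B(c_D^{n-1})]\, |D| \le \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D|,$$

which yields

$$\sum_{D \in \mathcal{D}_h^{int}} B(c_D^k)|D| - \sum_{D \in \mathcal{D}_h^{int}} B(c_D^0)|D| \le \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D|.$$

Using the growth condition on $\beta$ from Assumption (A1), one can derive
$B(s) \ge c_\beta s^2/2$ for all $s \in \mathbb{R}$. Thus, using in addition Assumption (A8),

$$\frac{c_\beta}{2} \sum_{D \in \mathcal{D}_h^{int}} (c_D^k)^2 |D| - M\beta(M)|\Omega| \le \sum_{n=1}^{k} \sum_{D \in \mathcal{D}_h^{int}} [\beta(c_D^n) - \beta(c_D^{n-1})]\, c_D^n |D|.$$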
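
The role of the function $B$ can be checked numerically on a concrete example. The sketch below assumes the hypothetical choice $\beta(s) = s + s^3$, which is continuous, strictly increasing, vanishes at zero and satisfies the growth condition of (A1) with $c_\beta = 1$; it is an illustration only and not a function used in the paper. For this $\beta$ the primitive is known in closed form, so the identity $B(a) - B(b) = [\beta(a) - \beta(b)]a - \int_b^a [\beta(\tau) - \beta(b)]\,d\tau$ and the bound $B(s) \ge c_\beta s^2/2$ can be evaluated directly.

```python
def beta(s):
    # Hypothetical example satisfying (A1) with c_beta = 1.
    return s + s ** 3


def beta_primitive(s):
    # Closed-form primitive of beta vanishing at 0.
    return s ** 2 / 2 + s ** 4 / 4


def B(s):
    # B(s) = beta(s) s - int_0^s beta(tau) dtau
    return beta(s) * s - beta_primitive(s)


def remainder(c_new, c_old):
    # int_{c_old}^{c_new} [beta(tau) - beta(c_old)] dtau, nonnegative for nondecreasing beta
    return beta_primitive(c_new) - beta_primitive(c_old) - beta(c_old) * (c_new - c_old)


def identity_gap(c_new, c_old):
    # Difference between the two sides of the identity for B; should vanish up to rounding.
    left = B(c_new) - B(c_old)
    right = (beta(c_new) - beta(c_old)) * c_new - remainder(c_new, c_old)
    return left - right
```

Because the remainder is nonnegative whichever of the two arguments is larger, dropping it gives the inequality that is summed over $n$ and $D$ in the proof.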



Using the Cauchy- Schwarz inequality, extending the summation over all n G 
{1,2,..., N} and D G T>h in the first right term of (16), and using the Young 
inequality, we have 

EAt„ E «|D|<(EAt„ E(^s)^i^i)"ikiko. < 

D€l>r' ■Del’h 

E^^" E iBhf\B\ + hq\\lQr- 



n=l 



n=l 



DeVh 







Substituting all the above estimates into (16), we obtain 

k 

.n E E ^tnWclfx, < uM^{M)\n\ + (18) 

+U-T max ^ {clf\D\+ uUqWl^^ + uLfM^T\Q\ , 

considering also (14) and the fact that k was arbitrarily chosen. We now choose 
s = When u ^ 0, this already leads the assertion of the lemma. When 

u = 0, it follows from (18) that = 0 for all D G Vh and all n G {1,2,..., N}^ 
since || • ||x;, is a norm on X^. Thus the assertion of the lemma is trivially 
satisfied in this case. □ 

Theorem 1. (Existence of the solution to the discrete problem) The 

problem (6) -(8) has at least one solution. 

The proof makes use of an induction argument. At each time level, Lemma 4
is employed. Consequently, one can use the (Brouwer) topological degree
argument (see [5]).

Theorem 2. (Uniqueness of the solution to the discrete problem) The
solution to the problem (6)-(8) is unique.

The assertion follows from Assumption (A1) and (A5) or (A6).

Theorem 3. (Discrete maximum principle) Under Assumption (D1), the
solution of the problem (6)-(8) satisfies, for all $D \in \mathcal{D}_h$ and $n \in \{1, 2, \dots, N\}$,

$$0 \le c_D^n \le M. \qquad (19)$$

One sets a transmissibility ^ = ID)2),e “ ^ ^ Af{D). In view 

of Assumption (Dl ) and (10), one has EJ) > 0 for all D G E G N{D), 
and hence one can prove the assertion as in [6]. 



5 A priori estimates 

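Theorem 4. (A priori estimates) The solution of the scheme (6)-(8)
satisfies

$$\max_{n \in \{1, \dots, N\}} \sum_{D \in \mathcal{D}_h} (c_D^n)^2 |D| \le C_{ae}, \qquad (20)$$

$$\max_{n \in \{1, \dots, N\}} \sum_{D \in \mathcal{D}_h} [\beta(c_D^n)]^2 |D| \le C_{ae}, \qquad (21)$$

$$c_{\mathbf{D}} \sum_{n=1}^{N} \Delta t_n \|c_h^n\|_{X_h}^2 \le C_{ae} \qquad (22)$$

with $c_h^n = \sum_{D \in \mathcal{D}_h} c_D^n \varphi_D$ and $C_{ae}$ a constant independent of $h$ and $\Delta t$.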



The a priori estimates follow from (18) with s = ^ and u = 1 and using 
Assumption (Al) or Assumption (A2). 

Definition 3. (Approximate solution) As the approximate solution of (1)-(3)
by means of the combined nonconforming/mixed-hybrid FE-FV scheme, we
understand:

(i) a function $c_{h,\Delta t}$ given by $c_h^n = \sum_{D \in \mathcal{D}_h} c_D^n \varphi_D \in X_h$ with $c_D^n$, $D \in \mathcal{D}_h$,
$n \in \{0, 1, \dots, N\}$, solutions to (6)-(8), such that $c_{h,\Delta t}$ is piecewise constant in
time;

(ii) a function $\tilde{c}_{h,\Delta t}$ given by the values $c_D^n$, $D \in \mathcal{D}_h$, $n \in \{0, 1, \dots, N\}$,
solutions to (6)-(8), and piecewise constant on the dual volumes $D \in \mathcal{D}_h$ and in
time.

The function $c_{h,\Delta t}$ is piecewise linear on $\mathcal{T}_h$ and continuous at the barycentres
of the interior sides, whereas $\tilde{c}_{h,\Delta t}$ is piecewise constant on $\mathcal{D}_h$. From the
a priori estimate (22), it follows immediately that

$$\|c_{h,\Delta t} - \tilde{c}_{h,\Delta t}\|_{0,Q_T} \to 0 \quad \text{as } h \to 0. \qquad (23)$$

Hence, to show the convergence, we can work with $\tilde{c}_{h,\Delta t}$ as in finite volume
methods.

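Lemma 5. (Time translate estimate) There exists a constant $C_{tt} > 0$,
such that

$$\int_0^{T-\tau}\!\!\int_\Omega \left(\tilde{c}_{h,\Delta t}(\mathbf{x}, t + \tau) - \tilde{c}_{h,\Delta t}(\mathbf{x}, t)\right)^2 d\mathbf{x}\, dt \le C_{tt}(\tau + \Delta t) \quad \forall \tau \in (0,T).$$

The proof is an adaptation of the technique used in [8] for degenerate parabolic
equations discretized by a finite volume scheme. It uses the equation (8) and
the a priori estimate (22).

Lemma 6. (Space translate estimate) Let us define $\tilde{c}_{h,\Delta t}$ by zero outside
of $\Omega$. Then there exists a constant $C_{st} > 0$, such that

$$\int_0^T\!\!\int_{\mathbb{R}^d} \left(\tilde{c}_{h,\Delta t}(\mathbf{x} + \boldsymbol{\xi}, t) - \tilde{c}_{h,\Delta t}(\mathbf{x}, t)\right)^2 d\mathbf{x}\, dt \le C_{st}|\boldsymbol{\xi}|(|\boldsymbol{\xi}| + h) \quad \forall \boldsymbol{\xi} \in \mathbb{R}^d.$$

The proof is again an adaptation of a technique used to investigate finite
volume schemes.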
6 Convergence

Theorem 5. (Strong convergence in $L^2(Q_T)$) There exist subsequences
of $c_{h,\Delta t}$ and $\tilde{c}_{h,\Delta t}$ which converge strongly in $L^2(Q_T)$ to some function
$u \in L^2(0,T;H_0^1(\Omega))$.

Proof:

From Lemmas 5 and 6 and (20), the sequence $\tilde{c}_{h,\Delta t}$ verifies the assumptions of
Kolmogorov’s theorem [3, Theorem IV.25], and thus $\tilde{c}_{h,\Delta t}$ converges strongly
in $L^2(Q_T)$ to some function $u \in L^2(Q_T)$. Moreover, due to Lemma 6, [6,
Theorem 3.10] gives that this $u \in L^2(0,T;H_0^1(\Omega))$. Finally, considering (23),
$c_{h,\Delta t}$ converges to the same $u$. □

Theorem 6. (Convergence to a weak solution) There exist subsequences
of $c_{h,\Delta t}$ and $\tilde{c}_{h,\Delta t}$ which converge strongly in $L^2(Q_T)$ to a weak solution given
by (4). If the weak solution is unique, then the whole sequences $c_{h,\Delta t}$, $\tilde{c}_{h,\Delta t}$
converge to the weak solution.

The strong convergence of Theorem 5 allows one to pass to the limit in the
nonlinear terms. For slightly more restrictive assumptions than (A), the
uniqueness follows from [9].
Summary. One of the most important problems in numerical simulation is the
preservation of qualitative properties of solutions of mathematical models. For
problems of parabolic type, one such property is the maximum principle. In [5], Fujii
analyzed the discrete analogue of the (continuous) maximum principle for linear
parabolic problems, and derived sufficient conditions guaranteeing its validity for the
Galerkin finite element approximations built on simplicial meshes. In our paper, we
present sufficient conditions for the validity of the discrete maximum principle
for the case of bilinear finite element space approximations on rectangular meshes.



1 Introduction 



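Consider a two-dimensional linear parabolic problem in the classical setting:
Find a function $u \in C^{1,2}((0,T) \times \Omega) \cap C([0,T) \times \overline{\Omega})$ such that

$$\frac{\partial u}{\partial t} - \alpha \sum_{k=1}^{2} \frac{\partial^2 u}{\partial x_k^2} = f \quad \text{in } (0,T) \times \Omega, \qquad (1)$$

$$u = g \ \text{ on } [0,T) \times \partial\Omega, \quad \text{and} \quad u|_{t=0} = u_0 \ \text{ in } \overline{\Omega}, \qquad (2)$$

where $\Omega$ is a polygonal domain in $\mathbb{R}^2$ with a boundary $\partial\Omega$, $T > 0$ and $\alpha$
is a positive constant. In order to guarantee the existence and uniqueness of
the classical solution $u = u(t, x_1, x_2) = u(t, x)$, we assume that the functions
$u_0 : \overline{\Omega} \to \mathbb{R}$, $f : (0,T) \times \Omega \to \mathbb{R}$ and $g : [0,T) \times \partial\Omega \to \mathbb{R}$ are sufficiently
smooth.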

The problem (1)-(2) serves as the mathematical model of various physical,
chemical or even ecological phenomena. It is well known that the estimate

$$\min\{0; \min_{\Gamma_t} u\} + t \min\{0; \min_{Q_t} f\} \le u(t, x) \le \max\{0; \max_{\Gamma_t} u\} + t \max\{0; \max_{Q_t} f\} \qquad (3)$$

is valid for the solution $u$ (see [7, Theorem 2.1] and [9, p. 79]). Here $Q_t$ with
$t \in [0, T]$ stands for the cylinder $(0, t) \times \Omega$ and $\Gamma_t$ is the union of its lateral surface
and its bottom. Formula (3) is called the continuous maximum principle.

To solve problem (1)-(2) numerically, we use certain discretizations, both
in the spatial and in the time coordinates. It is obvious that the validity of the
discrete analogue of the maximum principle, the so-called discrete maximum
principle (DMP), is a natural requirement for having an adequate numerical solution.

The topic of validity of various discrete maximum principles arose already
30 years ago. Thus, in the works [3] and [4], DMP was formulated and proved
for the finite difference and finite element approximations, respectively, for the
second order linear elliptic equations. In particular, in [4], DMP was proved
in the 2D case for the continuous piecewise linear finite element approximations
under the following geometrical conditions: the angles of triangles in the used
triangular meshes are not greater than $\pi/2$ (nonobtuse type condition), or less
than $\pi/2$ (acute type condition). More results on DMP for the elliptic problems
can be found in [6] and [8].

In [5], the validity of the DMP for the linear parabolic problem is analyzed:
the finite element discretization was performed with linear elements on
triangular (simplicial) meshes and the so-called $\theta$-method was applied to the time
discretization. The discrete analogue of (3) and sufficient conditions guaranteeing
its validity were obtained, where one of such conditions was the acuteness
of the triangulations used.

To the authors’ knowledge, there is no similar result on the validity of 
DMP for parabolic problems solved with the help of bilinear finite elements 
in space. As far as the validity of DMP on rectangular meshes is concerned, 
we mention the only work [2] in this respect, where the authors considered 
the simplest elliptic problem and showed that the corresponding DMP may 
not hold if the rectangular elements are chosen arbitrarily. They also derived 
sufficient conditions for the validity of DMP. 

In our paper, we give sufficient conditions for the validity of the discrete
maximum principle for the Galerkin finite element solutions based on the
bilinear elements on rectangular meshes for parabolic problems. Similarly to
the case of triangular meshes, we obtain a certain geometrical condition on the
shape of elements. Namely, we introduce the notion of a non-narrow rectangular
element, which represents an analogue of the nonobtuse triangular element for the
case of triangular meshes.
2 Discretization



2.1 Galerkin Finite Element Discretization with θ-method

Let $\Omega$ be a rectangular domain covered by the rectangular mesh $\mathcal{R}_h$, where
$h$ stands for the discretization parameter. Let $P_1, \dots, P_N$ denote the interior
nodes, and $P_{N+1}, \dots, P_{\bar{N}}$ the boundary ones in $\mathcal{R}_h$. We also define
$N_\partial := \bar{N} - N$.

Let $\phi_1, \dots, \phi_{\bar{N}}$ be basis functions defined as follows: each $\phi_i$ is required to
be continuous piecewise bilinear such that $\phi_i(P_j) = \delta_{ij}$, $i, j = 1, \dots, \bar{N}$, where
$\delta_{ij}$ is the Kronecker symbol. It is obvious that these basis functions have the
properties

$$\text{(a)}\ \ \phi_i \ge 0,\ i = 1, \dots, \bar{N}, \qquad \text{(b)}\ \ \sum_{i=1}^{\bar{N}} \phi_i \equiv 1 \ \text{ in } \overline{\Omega}. \qquad (4)$$

We denote the space of all possible linear combinations of the basis
functions by $V_h$ and define its subspace $V_h^0 = \{v \in V_h \mid v|_{\partial\Omega} = 0\}$. Based on
the usual weak formulation of the original problem, the semidiscrete form for
(1)-(2) reads: Find a function $u_h = u_h(t, x)$ such that

$$u_h(0, x) = \tilde{u}_0(x), \quad x \in \Omega, \qquad (5)$$

$$u_h(t, x) - g_h(t, x) \in V_h^0, \quad t \in (0, T), \qquad (6)$$

and

$$\int_\Omega \frac{\partial u_h}{\partial t} v_h \, dx + B(u_h, v_h) = \int_\Omega f v_h \, dx, \quad \forall v_h \in V_h^0,\ t \in (0, T), \qquad (7)$$

where $B(u_h, v_h) = \alpha \int_\Omega \operatorname{grad} u_h \cdot \operatorname{grad} v_h \, dx$. In the above, $\tilde{u}_0(x)$ and $g_h(t, x)$
(for any fixed $t$) are linear interpolants in $V_h$, i.e.,

$$\tilde{u}_0(x) = \sum_{i=1}^{\bar{N}} u_0(P_i)\phi_i(x) \quad \text{and} \quad g_h(t, x) = \sum_{i=1}^{N_\partial} g(t, P_{N+i})\phi_{N+i}(x).$$

We notice that, from the consistency of the initial and the boundary conditions
($g(0, s) = u_0(s)$, $s \in \partial\Omega$), we have $g(0, P_{N+i}) = u_0(P_{N+i})$, $i = 1, \dots, N_\partial$.

We search for the semidiscrete solution in the form

$$u_h(t, x) = \sum_{i=1}^{N} u_i(t)\phi_i(x) + g_h(t, x),$$

and notice that it is sufficient that $u_h$ satisfies (7) for $v_h = \phi_i$, $i = 1, \dots, N$,
only.
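Introducing the notation

$$\mathbf{v}^h(t) = [u_1(t), \dots, u_N(t), g(t, P_{N+1}), \dots, g(t, P_{N+N_\partial})]^T,$$

we arrive at the Cauchy problem for the system of ordinary differential
equations

$$\mathbf{M}\frac{d\mathbf{v}^h}{dt} + \mathbf{K}\mathbf{v}^h = \mathbf{f}, \qquad \mathbf{v}^h(0) = [u_0(P_1), \dots, u_0(P_N), g(0, P_{N+1}), \dots, g(0, P_{N+N_\partial})]^T \qquad (8)$$

for the solution of the semidiscrete problem, where

$$\mathbf{M} = [M_{ij}]_{N \times \bar{N}}, \quad \mathbf{K} = [K_{ij}]_{N \times \bar{N}}, \quad \mathbf{f} = [f_i]_{N \times 1}, \qquad (9)$$

$$M_{ij} = \int_\Omega \phi_j \phi_i \, dx, \quad K_{ij} = B(\phi_j, \phi_i), \quad f_i = \int_\Omega f \phi_i \, dx. \qquad (10)$$

The above defined matrices $\mathbf{M}$ and $\mathbf{K}$ are called mass and stiffness matrices,
respectively.

In order to get a fully discrete numerical scheme, we choose a time-step
$\Delta t$ and denote the approximation to $\mathbf{v}^h(n\Delta t)$ and $\mathbf{f}(n\Delta t)$ by $\mathbf{v}^n$ and $\mathbf{f}^n$,
$n = 0, 1, \dots, n_T$ ($n_T \Delta t = T$), respectively. To discretize (8), we apply the $\theta$-method
($\theta \in [0, 1]$ is a given parameter) and obtain the system of linear algebraic
equations

$$(\mathbf{M} + \theta\Delta t\mathbf{K})\mathbf{v}^{n+1} = (\mathbf{M} - (1 - \theta)\Delta t\mathbf{K})\mathbf{v}^n + \Delta t\, \mathbf{f}^{(n,\theta)}, \quad n = 0, 1, \dots \qquad (11)$$

where $\mathbf{v}^0 = \mathbf{v}^h(0)$ and $\mathbf{f}^{(n,\theta)} = \theta\mathbf{f}^{n+1} + (1 - \theta)\mathbf{f}^n$.

Further, let the matrices $\mathbf{M} + \theta\Delta t\mathbf{K}$ and $\mathbf{M} - (1 - \theta)\Delta t\mathbf{K}$ be denoted
by $\mathbf{A}$ and $\mathbf{B}$, respectively. In what follows, we shall use the partitions

$$\mathbf{A} = [\mathbf{A}_0 | \mathbf{A}_\partial], \quad \mathbf{B} = [\mathbf{B}_0 | \mathbf{B}_\partial], \quad \mathbf{v}^n = \begin{bmatrix} \mathbf{u}^n \\ \mathbf{g}^n \end{bmatrix}, \qquad (12)$$

where $\mathbf{A}_0$ and $\mathbf{B}_0$ are square matrices from $\mathbb{R}^{N \times N}$, $\mathbf{A}_\partial, \mathbf{B}_\partial \in \mathbb{R}^{N \times N_\partial}$,
$\mathbf{u}^n = [u_1^n, \dots, u_N^n]^T \in \mathbb{R}^N$, and $\mathbf{g}^n = [g_1^n, \dots, g_{N_\partial}^n]^T \in \mathbb{R}^{N_\partial}$. (Similar
partition is used for the matrices $\mathbf{M}$ and $\mathbf{K}$.) Then the iteration (11) can also be
written as

$$[\mathbf{A}_0 | \mathbf{A}_\partial] \begin{bmatrix} \mathbf{u}^{n+1} \\ \mathbf{g}^{n+1} \end{bmatrix} = [\mathbf{B}_0 | \mathbf{B}_\partial] \begin{bmatrix} \mathbf{u}^n \\ \mathbf{g}^n \end{bmatrix} + \Delta t\, \mathbf{f}^{(n,\theta)}. \qquad (13)$$

The iteration is well-defined, because $\mathbf{A}_0$ is a Gram matrix, thus it is invertible.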
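
One step of the θ-method can be sketched in a few lines. The sketch assumes homogeneous boundary data ($g = 0$), so that only the square blocks of the mass and stiffness matrices are needed, and it solves $(M + \theta\Delta t K)u^{n+1} = (M - (1-\theta)\Delta t K)u^n + \Delta t(\theta f^{n+1} + (1-\theta)f^n)$ with a small dense Gaussian elimination. The function names `solve` and `theta_step` are hypothetical, and matrices are stored as plain lists of rows.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def theta_step(M, K, u, f_old, f_new, dt, theta):
    """Advance u by one time step of the theta-method (interior unknowns only)."""
    n = len(u)
    # Left-hand side matrix A = M + theta * dt * K
    A = [[M[i][j] + theta * dt * K[i][j] for j in range(n)] for i in range(n)]
    rhs = []
    for i in range(n):
        # Row i of B u with B = M - (1 - theta) * dt * K
        b_u = sum((M[i][j] - (1 - theta) * dt * K[i][j]) * u[j] for j in range(n))
        rhs.append(b_u + dt * (theta * f_new[i] + (1 - theta) * f_old[i]))
    return solve(A, rhs)
```

For example, `theta_step([[1.0]], [[1.0]], [1.0], [0.0], [0.0], 0.1, 1.0)` performs one backward Euler step for the scalar equation $u' + u = 0$; $\theta = 0$ gives the explicit Euler step and $\theta = 1/2$ the Crank-Nicolson step.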



2.2 Entries of the mass and stiffness matrices 

To calculate the entries of the mass matrix $\mathbf{M}$, we first calculate the integral of
the product $\phi_j \phi_i$ only on a single rectangle $R$ (denoted by $M_{ij}|_R$). Obviously,
it is sufficient to do that on the reference rectangle defined by the vertices
$P_1 = (0, 0)$, $P_2 = (a, 0)$, $P_3 = (a, b)$ and $P_4 = (0, b)$. The four basis functions
corresponding to the vertices of the reference rectangle are:

$$\phi_1(x_1, x_2) = \frac{1}{ab}(x_1 - a)(x_2 - b), \qquad \phi_2(x_1, x_2) = -\frac{1}{ab}x_1(x_2 - b),$$

$$\phi_3(x_1, x_2) = \frac{1}{ab}x_1 x_2, \qquad \phi_4(x_1, x_2) = -\frac{1}{ab}(x_1 - a)x_2.$$

A simple calculation leads to ($M_{ij}|_R = M_{ji}|_R$)

$$M_{11}|_R = M_{22}|_R = M_{33}|_R = M_{44}|_R = \frac{ab}{9},$$

$$M_{12}|_R = M_{14}|_R = M_{23}|_R = M_{34}|_R = \frac{ab}{18},$$

$$M_{13}|_R = M_{24}|_R = \frac{ab}{36},$$

where $ab$ denotes the area of the rectangle $R$.

The elements of the stiffness matrix (integrating only on $R$) can be obtained
similarly. Thus,

$$K_{11}|_R = \frac{\alpha(a^2 + b^2)}{3ab}, \qquad K_{12}|_R = -\frac{\alpha(2b^2 - a^2)}{6ab},$$

$$K_{13}|_R = -\frac{\alpha(a^2 + b^2)}{6ab}, \qquad K_{14}|_R = -\frac{\alpha(2a^2 - b^2)}{6ab},$$

and any other value is equal to one of the above four numbers.
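
The element entries can be cross-checked numerically. The following sketch integrates the products of the four bilinear basis functions and of their gradients over the rectangle $[0,a] \times [0,b]$ with a $2 \times 2$ Gauss-Legendre rule, which is exact here because every integrand is at most quadratic in each variable. The function name `element_matrices` and the default diffusion constant are assumptions made for the illustration; the local node order follows the vertices $(0,0)$, $(a,0)$, $(a,b)$, $(0,b)$.

```python
import math


def element_matrices(a, b, alpha=1.0):
    """Return (M_R, K_R), the 4x4 element mass and stiffness matrices on [0,a]x[0,b]."""

    def basis(x, y):
        sx, sy = x / a, y / b
        # phi_1..phi_4 written as products of 1D linear factors
        vals = [(1 - sx) * (1 - sy), sx * (1 - sy), sx * sy, (1 - sx) * sy]
        grads = [
            (-(1 - sy) / a, -(1 - sx) / b),
            ((1 - sy) / a, -sx / b),
            (sy / a, sx / b),
            (-sy / a, (1 - sx) / b),
        ]
        return vals, grads

    g = 0.5 / math.sqrt(3.0)
    nodes = [0.5 - g, 0.5 + g]  # 2-point Gauss-Legendre nodes on [0, 1], weight 1/2 each
    weight = 0.25 * a * b       # product weight scaled by the element area
    M = [[0.0] * 4 for _ in range(4)]
    K = [[0.0] * 4 for _ in range(4)]
    for px in nodes:
        for py in nodes:
            vals, grads = basis(px * a, py * b)
            for i in range(4):
                for j in range(4):
                    M[i][j] += weight * vals[i] * vals[j]
                    dot = grads[i][0] * grads[j][0] + grads[i][1] * grads[j][1]
                    K[i][j] += weight * alpha * dot
    return M, K
```

On a rectangle with $a = 2$, $b = 1$ the entry `K[0][1]` comes out positive, which is the situation the non-narrowness condition is designed to exclude; each row of `K` sums to zero because the basis functions add up to one.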



2.3 Non-narrow rectangular meshes

In paper [5], a geometrical condition, the acuteness of the triangular meshes,
guaranteed the nonpositivity of the off-diagonal entries of $\mathbf{K}$. The situation
is similar for the case of the bilinear elements. The nonpositivity of the
off-diagonal entries of $\mathbf{K}$ is fulfilled for so-called non-narrow rectangular meshes.
Let $\mathcal{R}_h$ be a rectangular mesh and let us introduce the notation

$$\mu = \max_{R \in \mathcal{R}_h} \frac{\max\{a_R^2, b_R^2\} - 2\min\{a_R^2, b_R^2\}}{a_R b_R},$$

where $a_R$ and $b_R$ denote the lengths of the edges of the rectangle $R$. A
rectangular mesh $\mathcal{R}_h$ is called non-narrow if $\mu \le 0$. It is called strictly non-narrow
if $\mu < 0$. Hence, the non-narrowness of a mesh means that the longest edge
of each rectangle is not greater than $\sqrt{2}$ times the shortest one. The
non-narrowness of the mesh will imply the nonpositivity of the off-diagonal
elements of the stiffness matrix (see [1], page 254).
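
As a small illustration, the non-narrowness test can be evaluated directly from the edge lengths. The sketch assumes that the mesh is given as a list of `(a, b)` edge-length pairs and that the indicator is the maximum over rectangles of $(\max\{a^2, b^2\} - 2\min\{a^2, b^2\})/(ab)$, which is nonpositive exactly when the longer edge is at most $\sqrt{2}$ times the shorter one; the function names are hypothetical.

```python
def narrowness(rectangles):
    """Return mu = max over (a, b) of (max(a^2, b^2) - 2 min(a^2, b^2)) / (a b)."""
    return max(
        (max(a * a, b * b) - 2 * min(a * a, b * b)) / (a * b)
        for a, b in rectangles
    )


def is_non_narrow(rectangles, strict=False):
    """Check mu <= 0 (or mu < 0 when strict=True) for a list of edge-length pairs."""
    mu = narrowness(rectangles)
    return mu < 0 if strict else mu <= 0
```

A mesh of squares gives $\mu = -1$, while a single rectangle with edges 2 and 1 gives $\mu = 1$ and is therefore narrow.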
3 The Discrete Maximum Principle

Let us define the values 



V 



n 

min 



5 min 

= min{0, u^}, 



for n — 0, riT^ and 



9 max 

^max 



max{0, w? , . . . , 



p{n,n-\-l) 
J min 



min{0, min 

x£^2 ,T^(nAt,{n-\-l)At) 



f{T, a;)}, 



= max{0, 

xEf2,rG{nAt,(n-\-l)At) 



for n = 0 , riT — 1. 

The discrete analogue, the so-called discrete maximum principle (DMP), 
for the continuous maximum principle (3) can be written in the form (cf. [5, 

p. 100]) 



i = 1, . . . , iV; n = 0, ny — 1. 

Let us introduce the denotations 



< max{0,p”+i,t)”^^} + 



(15) 



e = [1, . . . , 1]^ € ]R^, eo = [1, . . . , 1]^ e IR^, ea = [1, . . . , 1]^ S IR^®, 

o(n,n+l) _ r{n,n-\-l) -rpjv f(n,n+l) _ ^(n,n+l) -rpiv 

Amax — Jmax ^ t -Uv. , 1 q — Jmax t irt , 

= /i”ax”+'^ea e ]R^^ 



Vmax = <axe € , 



x 60 



gIR^, 



,ea G 



For simplicity, we denote zero matrices and zero vectors by the symbol 0 , 
whose size is always chosen according to the context. The ordering relation is 
meant elementwise. 

Before proving the sufficient condition of the DMP, in the next auxiliary 
lemmas, some important properties of the matrices M,K, A and B are sum- 
marized. 



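Lemma 1. Let the rectangular mesh $\mathcal{R}_h$ for $\Omega$ be of non-narrow type
($\mu \le 0$). Then $K_{ij} \le 0$ ($i \ne j$, $i = 1, \dots, N$, $j = 1, \dots, \bar N$).

Proof. We denote $\operatorname{supp}\varphi_i \cap \operatorname{supp}\varphi_j$ by $S_{ij}$ and calculate $K_{ij}$ ($i \ne j$):

$$K_{ij} = a\int_\Omega \operatorname{grad}\varphi_j \cdot \operatorname{grad}\varphi_i \,dx
= a\int_{S_{ij}} \operatorname{grad}\varphi_j \cdot \operatorname{grad}\varphi_i \,dx
= \sum_{R \subset S_{ij}} a\int_R \operatorname{grad}\varphi_j \cdot \operatorname{grad}\varphi_i \,dx
=: \sum_{R \subset S_{ij}} K_{ij}|_R.$$

Because $K_{ij}|_R$ ($i \ne j$) is nonpositive for any non-narrow rectangle, we observe
that the off-diagonal entries of the stiffness matrix are nonpositive. ■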
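
The element-level statement used in the proof can be checked numerically. The sketch below assumes the standard closed-form stiffness matrix of the bilinear element on an $a_R \times b_R$ rectangle with a constant coefficient; the node ordering (counterclockwise from the lower-left corner) and the function name are illustrative choices, not taken from the paper.

```python
def local_stiffness(a_len, b_len, coeff=1.0):
    """4x4 stiffness matrix of the bilinear element on an a_len x b_len rectangle.

    Nodes are ordered counterclockwise starting at the lower-left corner;
    `coeff` is the (constant) diffusion coefficient.
    """
    p = coeff * b_len / (6.0 * a_len)  # weight of the x-derivative part
    q = coeff * a_len / (6.0 * b_len)  # weight of the y-derivative part
    kx = [[2, -2, -1, 1], [-2, 2, 1, -1], [-1, 1, 2, -2], [1, -1, -2, 2]]
    ky = [[2, 1, -1, -2], [1, 2, -2, -1], [-1, -2, 2, 1], [-2, -1, 1, 2]]
    return [[p * kx[i][j] + q * ky[i][j] for j in range(4)] for i in range(4)]

def max_off_diagonal(matrix):
    """Largest off-diagonal entry; nonpositive exactly when all of them are."""
    return max(matrix[i][j] for i in range(4) for j in range(4) if i != j)

print(max_off_diagonal(local_stiffness(1.0, 1.0)))  # square: all entries negative
print(max_off_diagonal(local_stiffness(2.0, 1.0)))  # narrow: a positive entry appears
```

For an edge of length $a_R$ the corresponding entry equals $a\,(a_R^2 - 2b_R^2)/(6a_Rb_R)$, which is where the $\sqrt{2}$ bound of the non-narrowness condition comes from. The row sums of the local matrix vanish, in line with $Ke = 0$.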
Lemma 2. Let the rectangular mesh $\mathcal{R}_h$ for $\Omega$ be of non-narrow type
($\mu \le 0$) and let the condition

$$B_{ii} = M_{ii} - (1 - \theta)\Delta t\, K_{ii} \ge 0, \quad i = 1, \dots, N, \tag{16}$$

be satisfied. Then $B \ge 0$.

Proof. The matrix $M$ is nonnegative, because the basis functions are
nonnegative. Moreover, the previous lemma guarantees the nonpositivity of
$K_{ij}$ ($i \ne j$), which implies that the off-diagonal entries of $B$ are nonnegative.
The nonnegativity of the diagonal entries of $B$ follows from condition (16). ■

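Lemma 3. The relations

(P1) $Ke = 0$, (P2) $f^{(n,\theta)} \le A\,\mathbf{f}^{(n,n+1)}_{\max}$

are valid.

Proof. (P1) For the $i$-th coordinate of the vector $Ke$, we have

$$(Ke)_i = \sum_{j=1}^{\bar N} B(\varphi_j, \varphi_i) = B\Big(\sum_{j=1}^{\bar N} \varphi_j, \varphi_i\Big)
= a\int_\Omega \operatorname{grad} 1 \cdot \operatorname{grad}\varphi_i \,dx = 0,$$

which proves the statement.

(P2) For the $i$-th element of $f^{(n,\theta)}$ we observe that

$$(f^{(n,\theta)})_i = \int_\Omega \big((1-\theta) f(n\Delta t, x) + \theta f((n+1)\Delta t, x)\big)\varphi_i(x)\,dx
\le f^{(n,n+1)}_{\max}\int_\Omega \varphi_i(x)\,dx$$

$$= f^{(n,n+1)}_{\max} \sum_{j=1}^{\bar N} M_{ij} = \big(M\mathbf{f}^{(n,n+1)}_{\max}\big)_i
= \big((M + \theta\Delta t K)\mathbf{f}^{(n,n+1)}_{\max}\big)_i = \big(A\,\mathbf{f}^{(n,n+1)}_{\max}\big)_i.$$

In the above, we used the facts that the basis functions are nonnegative, that their
sum equals the constant function one, and property (P1). ■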

Lemma 4. If

$$A_{ij} = M_{ij} + \theta\Delta t\, K_{ij} \le 0, \quad i \ne j,\ i = 1, \dots, N,\ j = 1, \dots, \bar N, \tag{17}$$

then $A_0^{-1} \ge 0$. Furthermore, the relations

$$-A_0^{-1}A_\partial \ge 0, \qquad -A_0^{-1}A_\partial e_\partial \le e_0 \tag{18}$$

are valid.

Proof. Combining the positive definiteness of $A_0$ with the assumption
of the lemma, we obtain that $A_0$ is a non-singular M-matrix. This yields
that the inverse of $A_0$ is nonnegative. Because of relation (17), the matrix $A_\partial$
is nonpositive, which implies $-A_0^{-1}A_\partial \ge 0$. The matrix $M$ is nonnegative,
because the basis functions are nonnegative. Thus,

$$0 \le Me = (M + \theta\Delta t K)e = Ae = A_0e_0 + A_\partial e_\partial,$$

and the last statement of the lemma can be obtained by multiplying both sides
by $A_0^{-1}$. ■

Now, we prove the main result of our paper, which presents a sufficient
condition for the validity of the DMP (cf. Theorem 1 in [5]).

Theorem 1. Let the rectangular mesh $\mathcal{R}_h$ of $\Omega$ be of non-narrow type and
let the time increment $\Delta t$ satisfy (16) and (17). Then the discrete maximum
principle (relation (15)) is valid.

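Proof. Using (13), property (P2), the relation $A\mathbf{v}^n_{\max} = B\mathbf{v}^n_{\max}$ (which follows
from (P1)) and Lemma 2, we have

$$Av^{n+1} = Bv^n + \Delta t\, f^{(n,\theta)} \le B\mathbf{v}^n_{\max} + \Delta t\, A\,\mathbf{f}^{(n,n+1)}_{\max}
= A\mathbf{v}^n_{\max} + \Delta t\, A\,\mathbf{f}^{(n,n+1)}_{\max}. \tag{19}$$

From (19), using the partition (12), multiplying both sides by $A_0^{-1}$ ($\ge 0$, see
Lemma 4), and regrouping the inequality, we get

$$u^{n+1} - \mathbf{v}^n_{0,\max} - \Delta t\,\mathbf{f}^{(n,n+1)}_{0,\max}
\le -A_0^{-1}A_\partial\big(g^{n+1} - \mathbf{v}^n_{\partial,\max} - \Delta t\,\mathbf{f}^{(n,n+1)}_{\partial,\max}\big).$$

Obviously,

$$\big(g^{n+1} - \mathbf{v}^n_{\partial,\max} - \Delta t\,\mathbf{f}^{(n,n+1)}_{\partial,\max}\big)_j
= g_j^{n+1} - v^n_{\max} - \Delta t\, f^{(n,n+1)}_{\max}
\le \max\{0, \max_j\{g_j^{n+1} - v^n_{\max}\}\}. \tag{20}$$

Therefore, using (19)-(20) and Lemma 4, we get

$$u^{n+1} - \mathbf{v}^n_{0,\max} - \Delta t\,\mathbf{f}^{(n,n+1)}_{0,\max}
\le \max\{0, \max_j\{g_j^{n+1} - v^n_{\max}\}\}\, e_0. \tag{21}$$

Writing (21) for the $i$-th component, and expressing $u_i^{n+1}$, we obtain the
right-hand side inequality in (15). The left-hand side inequality in (15) can be
proved in a similar manner. ■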

The previous theorem does not say anything about how to choose the
rectangulation and the parameters $\theta$ and $\Delta t$ in order to guarantee
the DMP. The validity of the DMP can be checked only after the direct
calculation of the elements of the matrices $A$ and $B$, by testing the two inequalities
in the above theorem. The next theorem guarantees the DMP a priori.
Theorem 2. The finite element solution of (1)-(2), using bilinear basis
functions on a strictly non-narrow rectangular mesh $\mathcal{R}_h$ of a rectangular domain
$\Omega$, satisfies the discrete maximum principle (15) if the conditions

$$\Delta t \ge \frac{\Delta^2}{3\theta|\mu|a} \tag{22}$$

and

$$\Delta t \le \frac{\gamma}{3(1-\theta)a} \tag{23}$$

are fulfilled, where

$$\Delta = \max_{R \in \mathcal{R}_h}\big\{\sqrt{a_R b_R}\big\}, \qquad
\gamma = \min_{R \in \mathcal{R}_h}\Big\{\frac{a_R^2 b_R^2}{a_R^2 + b_R^2}\Big\}.$$

Proof. It is easy to show that under conditions (22)-(23), the sufficient
conditions of the DMP in Theorem 1 are satisfied. The two inequalities (16) and (17) in that
theorem can be proven using conditions (23) and (22), respectively. ■



4 Final comments 



In this paper a priori sufficient conditions for the validity of the discrete max- 
imum principle have been given for the Galerkin finite element methods based 
on bilinear finite elements in space. We close the paper with some remarks 
regarding our results. 

— As usually happens in the qualitative analysis of finite element
approximations, there are both upper and lower bounds for the time step, which
means that $\Delta t$ can be chosen neither too small nor too large.

— A square mesh with the mesh-size $h$ is, obviously, strictly non-narrow. On
such meshes, the sufficient conditions for the DMP are

$$\Delta t \ge \frac{h^2}{3\theta a} \tag{24}$$

and

$$\Delta t \le \frac{h^2}{6(1-\theta)a}. \tag{25}$$

This shows that the time step can be chosen only for the values $\theta \ge 2/3$,
i.e., the Crank-Nicolson scheme is not included.

— The results of Theorem 1 are similar to Theorem 1 in [5]. The only difference
is that the non-narrowness condition is applied instead of
the acute type condition.
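
The two-sided restriction on the time step for a square mesh is easy to evaluate. The following sketch assumes the bounds $\Delta t \ge h^2/(3\theta a)$ and $\Delta t \le h^2/(6(1-\theta)a)$ of (24)-(25); the function name and the return convention are illustrative assumptions.

```python
import math

def dmp_time_step_window(h, theta, a=1.0):
    """Admissible [dt_min, dt_max] from (24)-(25) on a square mesh, or None if empty.

    h: mesh size, theta: parameter of the theta-method (0 < theta <= 1),
    a: constant diffusion coefficient.
    """
    if not (0.0 < theta <= 1.0) or h <= 0.0 or a <= 0.0:
        raise ValueError("expected h > 0, a > 0 and 0 < theta <= 1")
    dt_min = h * h / (3.0 * theta * a)
    # For theta = 1 (implicit Euler) condition (25) imposes no upper bound.
    dt_max = math.inf if theta == 1.0 else h * h / (6.0 * (1.0 - theta) * a)
    return (dt_min, dt_max) if dt_min <= dt_max else None

print(dmp_time_step_window(0.1, 0.75))  # non-empty window
print(dmp_time_step_window(0.1, 0.5))   # Crank-Nicolson: None
```

For $\theta = 1/2$ the lower bound exceeds the upper bound, which is the situation described above for the Crank-Nicolson scheme.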
Cubature-Differences Method for Singular
Integro-differential Equations 



Alexander I. Fedotov 

Chebotarev Institute of Mathematics & Mechanics, Kazan, Russia fedotov@mi.ru 



Summary. In the papers [1]-[4] the quadrature-differences methods for various
classes of 1-dimensional periodic singular integro-differential equations with
Hilbert kernels were justified. The convergence of the methods was proved and
error estimates were obtained. Here we propose and justify the cubature-differences
method for 2-dimensional^1 linear periodic singular integro-differential equations.
Such equations appear in the theory of elasticity (see [5]) and in some problems of
diffraction of electromagnetic waves (see e.g. [6]). The convergence of the method is
proved and an error estimate is obtained.



1 Statement of the problem 

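Let's define the sets $\mathbf{N} = \mathbb{N}^2$, $\mathbf{Z} = \mathbb{Z}^2$, $\mathbf{R} = \mathbb{R}^2$, $\Delta = [-\pi, \pi]^2$. For the
elements of these sets (2-component vectors), besides the usual operations, we'll
define the following operations

$$\mathbf{l} \cdot \mathbf{k} = l_1k_1 + l_2k_2, \quad \mathbf{l} * \mathbf{k} = (l_1k_1, l_2k_2), \quad |\mathbf{l}| = l_1 + l_2, \quad [\mathbf{l}] = l_1l_2,$$

and the partial order

$$\mathbf{l} < \mathbf{k} \equiv (l_1 < k_1)\ \&\ (l_2 < k_2), \quad \mathbf{l} = (l_1, l_2),\ \mathbf{k} = (k_1, k_2).$$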

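For fixed $s \in \mathbb{R}$ let's denote by $H^s$ the Sobolev space of 2-dimensional
$2\pi$-periodic complex-valued functions with the norm

$$\|u\|_s = \|u\|_{H^s} = \Big(\sum_{\mathbf{k} \in \mathbf{Z}} (1 + \mathbf{k}^2)^s\, |\hat{u}(\mathbf{k})|^2\Big)^{1/2},$$

where

$$\hat{u}(\mathbf{k}) = \int_\Delta u(\tau)\,\bar{e}_{\mathbf{k}}(\tau)\,d\tau$$

are the Fourier coefficients of the function $u(t)$ with respect to the system of trigonometric
monomials

$$e_{\mathbf{k}}(\tau) = \exp(i\,\mathbf{k} \cdot \tau), \quad \mathbf{k} \in \mathbf{Z},\ \tau \in \Delta.$$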



^1 The 2-dimensional case is considered only for the sake of simplicity. All results can
easily be generalised to the case of $m$ ($m \ge 3$) dimensions.
In the following we'll assume that $s > 1$, which provides (see e.g. [8]) the embedding
of $H^s$ into the space of continuous functions.

Consider the linear singular integro-differential equation

$$ABu + Tu = f, \tag{1}$$

where $A$ is a 2-dimensional singular integral operator

$$Au = a_{00}(t)u(t) + a_{01}(t)(J_{01}u)(t) + a_{10}(t)(J_{10}u)(t) + a_{11}(t)(J_{11}u)(t), \quad A : H^s \to H^s,$$

with the singular integrals

$$(J_{01}u)(t) = (2\pi)^{-1}\int_{-\pi}^{\pi} u(t_1, \tau_2)\cot\frac{\tau_2 - t_2}{2}\,d\tau_2,$$

$$(J_{10}u)(t) = (2\pi)^{-1}\int_{-\pi}^{\pi} u(\tau_1, t_2)\cot\frac{\tau_1 - t_1}{2}\,d\tau_1,$$

$$(J_{11}u)(t) = (2\pi)^{-2}\int_{-\pi}^{\pi}\!\int_{-\pi}^{\pi} u(\tau_1, \tau_2)\cot\frac{\tau_1 - t_1}{2}\cot\frac{\tau_2 - t_2}{2}\,d\tau_2\,d\tau_1,$$

which are to be interpreted in the sense of the Cauchy-Lebesgue principal value, $B$ is an
elliptic differential operator

$$Bu = (Bu)(t) = \sum_{|\alpha| = |\beta| = m} b_{\alpha\beta}(t)\,(D^{\alpha+\beta}u)(t), \quad m \in \mathbb{N},$$

with derivatives

$$D^{\alpha}u = \frac{\partial^{|\alpha|}u}{\partial t_1^{\alpha_1}\,\partial t_2^{\alpha_2}}$$

of order $\alpha = (\alpha_1, \alpha_2) \in \mathbf{N}$,

and $T : H^{s+2m} \to H^s$ is a known linear operator. The coefficients $a_{kl}(t)$, $k, l = 0, 1$,
$b_{\alpha\beta}(t)$, $|\alpha| = |\beta| = m$, and the right-hand side $f(t)$ of equation (1) are
assumed to belong to $H^s$.



2 Calculation scheme 

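Let's fix $\mathbf{n} = (n_1, n_2) \in \mathbf{N}$, denote by

$$I_{\mathbf{n}} = I_{n_1} \times I_{n_2}, \quad I_{n_j} = \{k_j \in \mathbb{Z} : |k_j| \le n_j\}, \quad j = 1, 2,$$

the index set and define the grid

$$\Delta_{\mathbf{n}} = \{t_{\mathbf{k}} = (t_{k_1}, t_{k_2}) \mid \mathbf{k} \in I_{\mathbf{n}},\ t_{k_j} = k_jh_j,\ h_j = 2\pi/(2n_j + 1),\ j = 1, 2\}$$

on $\Delta$. The approximate solution of equation (1) we'll seek as a periodic grid
function (vector of values) $u_{\mathbf{n}} = u_{\mathbf{n}}(t)$ defined on $\Delta_{\mathbf{n}}$.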
The differential operators of equation (1) we'll approximate by the operators

( 2 ) 

where 

OjUn = hj ('Un(^ “1“ ^n(^))5 Bjll^i ~ (^n(t) 

$\delta_j = (\delta_{j1}, \delta_{j2})$, $j = 1, 2$,
and $\delta_{jk}$ is the Kronecker symbol.

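Singular integrals are to be approximated by cubatures and quadratures.
To do this we'll integrate the interpolating Lagrange polynomials

$$(P_{\mathbf{n}}u_{\mathbf{n}})(\tau) = \sum_{\mathbf{k} \in I_{\mathbf{n}}} u_{\mathbf{n}}(t_{\mathbf{k}})\,\zeta_{\mathbf{n}}(\tau, t_{\mathbf{k}}), \qquad
\zeta_{\mathbf{n}}(\tau, t_{\mathbf{k}}) = \prod_{j=1}^{2} \frac{\sin((2n_j + 1)(\tau_j - t_{k_j})/2)}{(2n_j + 1)\sin((\tau_j - t_{k_j})/2)},$$

$\tau = (\tau_1, \tau_2) \in \Delta$, $t_{\mathbf{k}} = (t_{k_1}, t_{k_2}) \in \Delta_{\mathbf{n}}$.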
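
The interpolation property behind this construction can be illustrated in one dimension. The sketch below assumes the kernel $\sin((2n+1)x/2)/((2n+1)\sin(x/2))$ and the nodes $t_k = 2\pi k/(2n+1)$, $|k| \le n$; the function names are hypothetical. The 2-dimensional kernel is the product of two such factors.

```python
import math

def zeta_1d(n, tau, t_k):
    """One factor of the Lagrange kernel: sin((2n+1)x/2) / ((2n+1) sin(x/2)), x = tau - t_k."""
    x = tau - t_k
    s = math.sin(x / 2.0)
    if abs(s) < 1e-14:  # removable singularity: the kernel tends to 1 there
        return 1.0
    return math.sin((2 * n + 1) * x / 2.0) / ((2 * n + 1) * s)

def nodes_1d(n):
    """Equidistant nodes t_k = k*h, h = 2*pi/(2n+1), k = -n..n."""
    h = 2.0 * math.pi / (2 * n + 1)
    return [k * h for k in range(-n, n + 1)]

def interpolate_1d(n, values, tau):
    """Evaluate the trigonometric interpolant of the nodal `values` at tau."""
    return sum(v * zeta_1d(n, tau, t) for v, t in zip(values, nodes_1d(n)))

# The kernel is cardinal: 1 at its own node, 0 at every other node.
n = 3
t = nodes_1d(n)
print(zeta_1d(n, t[2], t[2]), zeta_1d(n, t[5], t[2]))
```

With $2n+1$ nodes the interpolant reproduces every trigonometric polynomial of order not higher than $n$, which is why integrating it yields quadrature sums that are exact on such polynomials.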

Then the integrals will take the form 

(‘/01-Pn'i^n)(tk) = (2n2 + l)“^ 7l2-/2'^n(^/ci,^/2). (3) 

h€U2 

(‘^10'Pnrtn)(tk) = (2ni + 1) ^ ^ 

( JllPn^n)(tk) = [2n + 1]-1 ^ tk G An, 1 = (1, 1), 

1GI„ 

and the coefficients 7^^^ are 

(n^ f VTT rn ... 

The operator $T$ we'll approximate by any convergent operator $T_{\mathbf{n}}$.
Substituting the numerical differentiation formulas (2), the cubature and quadrature
sums (3), and the values of the coefficients $a_{kl}(t)$, $k, l = 0, 1$, $b_{\alpha\beta}(t)$, $|\alpha| = |\beta| = m$,
of the operator $(T_{\mathbf{n}}u_{\mathbf{n}})(t)$ and of the right-hand side $f(t)$ at the nodes of the grid
$\Delta_{\mathbf{n}}$ into equation (1), we'll obtain a system of linear algebraic equations

aoo(tk) ^a/3(tk)(-D“’'’'^Wn)(tk)+ 

\Ot\ = \f3\=m 



( 4 ) 
\a\=\/3\=r.



h^Ini \CX.\ = \f3\=m 

+ iXXXX E b^i^{ti„ti,){D^+^u^){ti„ti^)+ 

\CC\=\^l=m 

+ (Tn'Un)(tk) = /(tk)? tk G ^ri5 
of the cubature-differences method. 



3 Preliminaries 

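Let's denote by $H^s_{\mathbf{n}}$ the set of grid functions (vectors of values) on $\Delta_{\mathbf{n}}$
with the norm

$$\|u_{\mathbf{n}}\|_{s,\mathbf{n}} = \|u_{\mathbf{n}}\|_{H^s_{\mathbf{n}}} = \Big(\sum_{\mathbf{k} \in I_{\mathbf{n}}} (1 + \mathbf{k}^2)^s\, |\hat{u}_{\mathbf{n}}(\mathbf{k})|^2\Big)^{1/2},$$

where

$$\hat{u}_{\mathbf{n}}(\mathbf{k}) = [2\mathbf{n} + 1]^{-1}\sum_{\mathbf{l} \in I_{\mathbf{n}}} u_{\mathbf{n}}(t_{\mathbf{l}})\,\bar{e}_{\mathbf{k}}(t_{\mathbf{l}}), \quad \mathbf{k} \in I_{\mathbf{n}},$$

are the Fourier-Lagrange coefficients of the function $u_{\mathbf{n}}(t)$ with respect to the grid $\Delta_{\mathbf{n}}$.
The sets $H^s$ and $H^s_{\mathbf{n}}$ will be mapped onto each other by the operators

$$p_{\mathbf{n}}u = (u(t_{\mathbf{k}}))_{\mathbf{k} \in I_{\mathbf{n}}}, \quad p_{\mathbf{n}} : H^s \to H^s_{\mathbf{n}},$$

$$(P_{\mathbf{n}}u_{\mathbf{n}})(\tau) = \sum_{\mathbf{k} \in I_{\mathbf{n}}} u_{\mathbf{n}}(t_{\mathbf{k}})\,\zeta_{\mathbf{n}}(\tau, t_{\mathbf{k}}), \quad P_{\mathbf{n}} : H^s_{\mathbf{n}} \to H^s,$$

and by $E_{\mathbf{n}}(u)_s$ we denote the best approximation of the function $u \in H^s$ by
trigonometric polynomials of order not higher than $\mathbf{n}$.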

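Lemma 1. For any $u \in H^s$, $s \in \mathbb{R}$, $s > 1$ and $\mathbf{n} \in \mathbf{N}$ the following estimates
are valid:

$$\|P_{\mathbf{n}}\|_{H^s_{\mathbf{n}} \to H^s} = 1, \qquad \|p_{\mathbf{n}}\|_{H^s \to H^s_{\mathbf{n}}} \le 2M(\mathbf{n}, s)\sqrt{\zeta(2s - 1)},$$

$$\|P_{\mathbf{n}}p_{\mathbf{n}}u - u\|_s \le \big(1 + 2M(\mathbf{n}, s)\sqrt{\zeta(2s - 1)}\big)E_{\mathbf{n}}(u)_s,$$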
where M(n,s) = ( rE(n^n| ) n G N, and <^{t) is Riemann’s (^-function. 







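To prove the convergence of the method we need the function $M(\mathbf{n}, s)$ to
be bounded. For some $c, s \in \mathbb{R}$ let's define the set

$$\mathbf{N}(c, s) = \{\mathbf{n} \mid \mathbf{n} \in \mathbf{N},\ M(\mathbf{n}, s) \le c\}.$$

Obviously, $\mathbf{N}(c, s) = \emptyset$ for $c < 2^{s/2}$ and $\mathbf{N}(c, s) = \{\mathbf{n} \mid \mathbf{n} = (j, j),\ j \in \mathbb{N}\}$ for
$c = 2^{s/2}$. In the following we'll assume that all indices $\mathbf{n}$, $\mathbf{n}_0$, $\mathbf{n}_1$ mentioned
below belong to $\mathbf{N}(c, s)$ for some $c > 2^{s/2}$.

Lemma 2. For any $s < p$, $u \in H^p$,

$$E_{\mathbf{n}}(u)_s \le (1 + \mathbf{n}^2)^{(s-p)/2}E_{\mathbf{n}}(u)_p.$$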


4 Justification 

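Theorem. Let for some $c, s \in \mathbb{R}$, $s > 1$, $c > 2^{s/2}$, equation (1) and the calculation
scheme (2)-(4) of the method satisfy the following conditions:

1) for any $\mathbf{n}$ the operator $A$ maps the set of all trigonometric polynomials
of order not higher than $\mathbf{n}$ to itself,

2) $B$ is an elliptic operator,

3) the operator $T : H^{s+2m} \to H^{s+\varepsilon}$ is bounded for some $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$,

4) the sequence of the operators $T_{\mathbf{n}}$ approximates the operator $T$ with respect
to $p_{\mathbf{n}}$, i.e. for any function $u \in H^{s+2m}$:

$$\|T_{\mathbf{n}}p_{\mathbf{n}}u - p_{\mathbf{n}}Tu\|_{s,\mathbf{n}} \le \eta_{\mathbf{n}}\|u\|_{s+2m} \quad \text{with } \eta_{\mathbf{n}} \to 0 \text{ for } \mathbf{n} \to \infty,$$

5) equation (1) has a unique solution $u^* \in H^{s+2m}$ for any right-hand side
$f \in H^s$.

Then for all $\mathbf{n}$, beginning from some $\mathbf{n}_0$, the system of equations (4) is
uniquely solvable and the approximate solutions $u^*_{\mathbf{n}}$ converge to the exact solution $u^*$
of equation (1):

$$\|u^*_{\mathbf{n}} - p_{\mathbf{n}}u^*\|_{s+2m,\mathbf{n}} \to 0 \quad \text{for } \mathbf{n} \to \infty.$$

If, in addition, $u^* \in H^{s+2m+2}$, then the error estimate

$$\|u^*_{\mathbf{n}} - p_{\mathbf{n}}u^*\|_{s+2m,\mathbf{n}} \le C(h^2 + \eta_{\mathbf{n}}), \quad h = (h_1, h_2),\ h_j = 2\pi/(2n_j + 1),\ j = 1, 2,$$

is valid.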

Proof. Let's take an arbitrary constant $r \in \mathbb{R}$ which is not an eigenvalue
of the problem $Bu + ru = 0$, $u \in H^{s+2m}$, and substitute into equation (1)

$$v = Bu + ru, \quad v \in H^s. \tag{5}$$

The existence of such a constant follows from the properties of the spectrum
of elliptic operators (see e.g. [7]). Then

$$u = Gv, \qquad Bu = v - rGv, \tag{6}$$
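where $G$ is the inverse to $Bu + ru$, and equation (1) will take the form

$$Kv \equiv Av - rAGv + TGv = f, \quad K : H^s \to H^s, \tag{7}$$

being still equivalent to the original one. The equivalence here means that
solvability of one of the equations yields solvability of the other, and their
solutions are related by the relationships (5), (6). Now let's rewrite the system
of equations (4) as an operator equation

$$A_{\mathbf{n}}B_{\mathbf{n}}u_{\mathbf{n}} + T_{\mathbf{n}}u_{\mathbf{n}} = f_{\mathbf{n}}, \tag{8}$$

$$A_{\mathbf{n}} = p_{\mathbf{n}}AP_{\mathbf{n}}, \quad f_{\mathbf{n}} = p_{\mathbf{n}}f,$$

$$(B_{\mathbf{n}}u_{\mathbf{n}})(t_{\mathbf{k}}) = \sum_{|\alpha| = |\beta| = m} b_{\alpha\beta}(t_{\mathbf{k}})\,(D^{\alpha+\beta}u_{\mathbf{n}})(t_{\mathbf{k}}), \quad t_{\mathbf{k}} \in \Delta_{\mathbf{n}},$$

and make the substitution

$$v_{\mathbf{n}} = B_{\mathbf{n}}u_{\mathbf{n}} + ru_{\mathbf{n}}, \quad v_{\mathbf{n}} \in H^s_{\mathbf{n}}. \tag{9}$$

As shown in [10], equation (9) is uniquely solvable for all $\mathbf{n}$, beginning from
some $\mathbf{n}_1$, and for $v_{\mathbf{n}} = p_{\mathbf{n}}v$ the solutions $u_{\mathbf{n}} = G_{\mathbf{n}}v_{\mathbf{n}} = G_{\mathbf{n}}p_{\mathbf{n}}v$ converge to the
solution $u = Gv$ of equation (5). Here $G_{\mathbf{n}}$ is the inverse to the operator $B_{\mathbf{n}}u_{\mathbf{n}} + ru_{\mathbf{n}}$
and

$$u_{\mathbf{n}} = G_{\mathbf{n}}v_{\mathbf{n}}, \qquad B_{\mathbf{n}}u_{\mathbf{n}} = v_{\mathbf{n}} - rG_{\mathbf{n}}v_{\mathbf{n}}. \tag{10}$$

By substitution (9) we'll get the equation

$$K_{\mathbf{n}}v_{\mathbf{n}} \equiv A_{\mathbf{n}}v_{\mathbf{n}} - rA_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} + T_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} = f_{\mathbf{n}}, \quad K_{\mathbf{n}} : H^s_{\mathbf{n}} \to H^s_{\mathbf{n}}, \tag{11}$$

which is equivalent to equation (8). As before, the equivalence here means
that solvability of one of the equations yields solvability of the other, and their
solutions are related by the relationships (9), (10).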

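The invertibility of the operators $K_{\mathbf{n}} : H^s_{\mathbf{n}} \to H^s_{\mathbf{n}}$ we'll prove following [9].

To do this we have to establish the following:

a) $\|P_{\mathbf{n}}f_{\mathbf{n}} - f\|_s \to 0$ for $\mathbf{n} \to \infty$;

b) the sequence of operators $(K_{\mathbf{n}})$ approximates the operator $K$ compactly;

c) $K$ is invertible.

The validity of a) follows immediately from Lemma 1^2:

$$\|P_{\mathbf{n}}f_{\mathbf{n}} - f\|_s = \|P_{\mathbf{n}}p_{\mathbf{n}}f - f\|_s \le CE_{\mathbf{n}}(f)_s.$$

To check b) we have to show first that the sequence $(K_{\mathbf{n}})$ approximates
the operator $K$ with respect to $P_{\mathbf{n}}$, and then that for any bounded sequence
$(v_{\mathbf{n}})$, $v_{\mathbf{n}} \in H^s_{\mathbf{n}}$, the sequence $(P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}})$ is compact in $H^s$.

For arbitrary $v_{\mathbf{n}} \in H^s_{\mathbf{n}}$ we'll write

$$\|P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}}\|_s \le \|P_{\mathbf{n}}A_{\mathbf{n}}v_{\mathbf{n}} - AP_{\mathbf{n}}v_{\mathbf{n}}\|_s
+ |r|\,\|P_{\mathbf{n}}A_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} - AGP_{\mathbf{n}}v_{\mathbf{n}}\|_s + \|P_{\mathbf{n}}T_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} - TGP_{\mathbf{n}}v_{\mathbf{n}}\|_s \tag{12}$$

^2 Here and further $C$ denotes generic real positive constants, independent of $\mathbf{n}$.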

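and estimate each summand of the right-hand side. From the definition of the
operator $A_{\mathbf{n}}$ and condition 1) of the Theorem it follows that the first summand is
equal to zero. For the second summand, using once more the definition of the
operator $A_{\mathbf{n}}$, condition 1) of the Theorem and the boundedness of the operators $A$
and $P_{\mathbf{n}}$, we'll have

$$|r|\,\|P_{\mathbf{n}}A_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} - AGP_{\mathbf{n}}v_{\mathbf{n}}\|_s \le C\|P_{\mathbf{n}}p_{\mathbf{n}}AP_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} - AGP_{\mathbf{n}}v_{\mathbf{n}}\|_s \le$$

$$\le C\big(\|G_{\mathbf{n}}v_{\mathbf{n}} - p_{\mathbf{n}}GP_{\mathbf{n}}v_{\mathbf{n}}\|_{s+2m,\mathbf{n}} + E_{\mathbf{n}}(GP_{\mathbf{n}}v_{\mathbf{n}})_{s+2m}\big).$$

For the third summand, using Lemma 1 and the boundedness of the operators
$T_{\mathbf{n}}$, we'll obtain

$$\|P_{\mathbf{n}}T_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} - TGP_{\mathbf{n}}v_{\mathbf{n}}\|_s \le C\big(\|G_{\mathbf{n}}v_{\mathbf{n}} - p_{\mathbf{n}}GP_{\mathbf{n}}v_{\mathbf{n}}\|_{s+2m,\mathbf{n}} +$$

$$+ \|T_{\mathbf{n}}p_{\mathbf{n}}GP_{\mathbf{n}}v_{\mathbf{n}} - p_{\mathbf{n}}TGP_{\mathbf{n}}v_{\mathbf{n}}\|_{s,\mathbf{n}} + E_{\mathbf{n}}(TGP_{\mathbf{n}}v_{\mathbf{n}})_s\big).$$

Finally, estimate (12) will take the form

$$\|P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}}\|_s \le C\big(\|G_{\mathbf{n}}v_{\mathbf{n}} - p_{\mathbf{n}}GP_{\mathbf{n}}v_{\mathbf{n}}\|_{s+2m,\mathbf{n}} +$$

$$+ \|T_{\mathbf{n}}p_{\mathbf{n}}GP_{\mathbf{n}}v_{\mathbf{n}} - p_{\mathbf{n}}TGP_{\mathbf{n}}v_{\mathbf{n}}\|_{s,\mathbf{n}} + E_{\mathbf{n}}(GP_{\mathbf{n}}v_{\mathbf{n}})_{s+2m} + E_{\mathbf{n}}(TGP_{\mathbf{n}}v_{\mathbf{n}})_s\big),$$

which, taking into account condition 4) of the Theorem, the convergence of the operators
$(G_{\mathbf{n}})$ and the convergence to zero of the best approximations of the functions $GP_{\mathbf{n}}v_{\mathbf{n}}$
and $TGP_{\mathbf{n}}v_{\mathbf{n}}$, means that

$$\|P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}}\|_s \to 0,$$

and thus the sequence of operators $(K_{\mathbf{n}})$ approximates the operator $K$ with
respect to $P_{\mathbf{n}}$.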

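Let's assume now that the sequence $(v_{\mathbf{n}})$, $v_{\mathbf{n}} \in H^s_{\mathbf{n}}$, is bounded, $\|v_{\mathbf{n}}\|_{s,\mathbf{n}} \le 1$,
and prove that the sequence $(P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}})$ is compact in $H^s$. We'll write

$$P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}} = rAGP_{\mathbf{n}}v_{\mathbf{n}} - TGP_{\mathbf{n}}v_{\mathbf{n}} - rAP_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}} + P_{\mathbf{n}}T_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}},$$

and prove the compactness of each summand of the right-hand side. The operators
$G : H^s \to H^{s+2m}$, $T : H^{s+2m} \to H^{s+\varepsilon}$ and $A : H^{s+2m} \to H^{s+2m}$ are bounded, so the
sequences $(rAGP_{\mathbf{n}}v_{\mathbf{n}})$ and $(TGP_{\mathbf{n}}v_{\mathbf{n}})$ are bounded in $H^{s+\delta}$, $\delta = \min(2m, \varepsilon)$,
and thus compact in $H^s$. The operators $G_{\mathbf{n}}$ and $T_{\mathbf{n}}G_{\mathbf{n}}$
are also bounded, so the polynomials $P_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}}$ and $P_{\mathbf{n}}T_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}}$ are bounded
and thus, due to the Riesz theorem, the sequences $(rAP_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}})$ and $(P_{\mathbf{n}}T_{\mathbf{n}}G_{\mathbf{n}}v_{\mathbf{n}})$
are also compact in $H^s$, which gives the compactness of the sequence
$(P_{\mathbf{n}}K_{\mathbf{n}}v_{\mathbf{n}} - KP_{\mathbf{n}}v_{\mathbf{n}})$.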

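The validity of c) follows from condition 5) of the Theorem and the equivalence of
equations (1) and (7).

Therefore, according to Theorem 6.1 of [9], for all $\mathbf{n}$, beginning from some $\mathbf{n}_0$,
$\mathbf{n}_0 \ge \mathbf{n}_1$, equations (11), (8), and thus the system of equations (4), are uniquely
solvable, and the approximate solutions $(u^*_{\mathbf{n}})$ of the system of equations (4) converge
to the exact solution $u^*$ of equation (1) with the rate

$$\|u^*_{\mathbf{n}} - p_{\mathbf{n}}u^*\|_{s+2m,\mathbf{n}} \le C\|p_{\mathbf{n}}(ABu^* + Tu^*) - (A_{\mathbf{n}}B_{\mathbf{n}}p_{\mathbf{n}}u^* + T_{\mathbf{n}}p_{\mathbf{n}}u^*)\|_{s,\mathbf{n}} \le$$

$$\le C\big(E_{\mathbf{n}}(Bu^*)_s + \|p_{\mathbf{n}}Bu^* - B_{\mathbf{n}}p_{\mathbf{n}}u^*\|_{s,\mathbf{n}} + \|p_{\mathbf{n}}Tu^* - T_{\mathbf{n}}p_{\mathbf{n}}u^*\|_{s,\mathbf{n}}\big).$$

If, moreover, $u^* \in H^{s+2m+2}$, then $Bu^* \in H^{s+2}$ and, as shown in [9],

$$\|p_{\mathbf{n}}Bu^* - B_{\mathbf{n}}p_{\mathbf{n}}u^*\|_{s,\mathbf{n}} \le Ch^2.$$

On the other hand, according to Lemma 2, and using the inequality
$(1 + \mathbf{n}^2)^{-q} \le Ch^{2q}$, $q \in \mathbb{R}$, $q > 0$, we'll have

$$E_{\mathbf{n}}(Bu^*)_s \le (1 + \mathbf{n}^2)^{-1}E_{\mathbf{n}}(Bu^*)_{s+2} \le Ch^2,$$

which, together with condition 4) of the Theorem, gives the desired estimate

$$\|u^*_{\mathbf{n}} - p_{\mathbf{n}}u^*\|_{s+2m,\mathbf{n}} \le C(h^2 + \eta_{\mathbf{n}}).$$

Hence, the Theorem is proved.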
Summary. For the numerical solution of coupled problems on two nested domains,
two meshes are used which are completely independent of each other. Especially in
the case of a moving subdomain, this leads to great flexibility for employing
different meshsizes, discretizations or model equations on the two domains. We present
a general setting for these problems in terms of saddle point formulations, and
investigate one- and bi-directionally coupled applications.



1 Introduction 

We consider coupled problems on two nested domains, the global domain $\Omega$
and the subdomain $\omega$, see Figure 1. In order to approximate the involved
solution components on $\Omega$ and $\omega$, two meshes are used which are completely
independent of each other. We would like to be able to deal with different meshsizes,
discretizations and model equations on the two domains. Our approach is
useful especially for a moving subdomain, i.e., when $\omega$ changes its position inside
the global domain. In this case, no remeshing is necessary and only the
matrices responsible for the coupling have to be reassembled. In Section 2, we
start with the general variational setting in terms of a saddle point
formulation. A one-directionally coupled model problem is investigated in Section 3. In
Section 4, we consider bi-directionally coupled formulations on the examples
of a linear elasticity problem and an eddy current simulation.

Fig. 1. Two nested domains (left), independent grids (right)
2 Variational Setting

Generalized saddle point problems. Our goal is to find a solution
$u = (u_\Omega, u_\omega)$ consisting of two components defined on the global domain $\Omega$
and on the subdomain $\omega$, respectively. We denote by $V_\Omega$ and $V_\omega$ the
appropriate weak function spaces for the solution components as well as for the test
functions. Without taking into account any coupling between the two
solution components, the involved differential operators are in general described
by continuous bilinear forms $a_\Omega(\cdot, \cdot)$ acting on $V_\Omega \times V_\Omega$, and $a_\omega(\cdot, \cdot)$ acting on
$V_\omega \times V_\omega$. Indicating by $V$ the product space $V_\Omega \times V_\omega$, a composed bilinear form
$a(\cdot, \cdot) : V \times V \to \mathbb{R}$ is obtained by

$$a(w, v) = a_\Omega(w_\Omega, v_\Omega) + a_\omega(w_\omega, v_\omega), \quad w, v \in V.$$

The coupling between the two solution components is realized via the Lagrange
multiplier space $M$ in terms of two continuous bilinear forms $b_1(\cdot, \cdot)$ and $b_2(\cdot, \cdot)$
acting on $V \times M$. For the applications in Section 3 and Subsection 4.2, $M$ is the
dual of the trace space, i.e., $M = H^{-1/2}(\Gamma)$, whereas in Subsection
4.1 $M = H^{1/2}(\Gamma)$, with $\Gamma = \partial\omega$ indicating the subdomain boundary.

Solving additionally for the Lagrange multiplier $p \in M$, the following
generalized saddle point problem is derived: find $(u, p) \in V \times M$ such that

$$a(u, v) + b_1(v, p) = \langle f, v\rangle_{V' \times V}, \quad v \in V,$$

$$b_2(u, q) = \langle g, q\rangle_{M' \times M}, \quad q \in M, \tag{1}$$

where $\langle\cdot, \cdot\rangle_{V' \times V}$ and $\langle\cdot, \cdot\rangle_{M' \times M}$ denote the usual duality pairings. We point
out that for $b_1(\cdot, \cdot) = b_2(\cdot, \cdot)$, problem (1) has the usual symmetric structure,
which is encountered for example in the framework of mixed [1] and mortar
[2] finite element methods. Moreover, if $b_1(\cdot, \cdot)$ acts only on either $V_\Omega$ or $V_\omega$,
one-directionally coupled problems are derived.

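The bilinear forms $b_i(\cdot, \cdot)$ define coupling operators $B_i : V \to M'$ and $B_i^T : M \to V'$
by $\langle B_iv, q\rangle_{M' \times M} = \langle v, B_i^Tq\rangle_{V \times V'} = b_i(v, q)$ for $v \in V$ and $q \in M$.
The validity of the following coercivity and inf-sup conditions guarantees
the unique solvability of problem (1) in $V \times M/\operatorname{Ker}B_1^T$ [5]:

$$\exists\,\alpha_0 > 0 : \quad \sup_{v_0 \in \operatorname{Ker}B_1} \frac{a(w_0, v_0)}{\|w_0\|_V\,\|v_0\|_V} \ge \alpha_0, \quad w_0 \in \operatorname{Ker}B_2, \tag{2}$$

$$\sup_{w_0 \in \operatorname{Ker}B_2} \frac{a(w_0, v_0)}{\|w_0\|_V\,\|v_0\|_V} \ge \alpha_0, \quad v_0 \in \operatorname{Ker}B_1, \tag{3}$$

$$\exists\,k_0 > 0 : \quad \inf_{q \in M}\,\sup_{v \in V} \frac{b_i(v, q)}{\|v\|_V\,\|q\|_{M/\operatorname{Ker}B_i^T}} \ge k_0, \quad i = 1, 2. \tag{4}$$

We note that the above conditions can be relaxed further [3, 11].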
Discretization. We use two different shape regular quasi-uniform
triangulations $\mathcal{T}_H$ on $\Omega$ and $\mathcal{T}_h$ on $\omega$, as illustrated in Figure 1, with $H$ and $h$ indicating
the corresponding maximum element diameters. The function spaces $V_\Omega$, $V_\omega$
and $M$ are replaced by discrete approximations $V_H \subset V_\Omega$, $V_h \subset V_\omega$, and
$M_h \subset M$, respectively. We denote an element $(v_H, v_h)$ of the product space
$V_h^H = V_H \times V_h$ by $v_h^H$. It may become necessary to involve approximate
bilinear forms $a_h(\cdot, \cdot)$, $b_{1,h}(\cdot, \cdot)$ and $b_{2,h}(\cdot, \cdot)$. The discrete saddle point formulation
reads: find $(u_h^H, p_h) \in V_h^H \times M_h$ such that

$$a_h(u_h^H, v) + b_{1,h}(v, p_h) = \langle f, v\rangle_{V' \times V}, \quad v \in V_h^H,$$

$$b_{2,h}(u_h^H, q) = \langle g, q\rangle_{M' \times M}, \quad q \in M_h. \tag{5}$$

If the conditions corresponding to (2)-(4) hold for $a_h(\cdot, \cdot)$, $b_{1,h}(\cdot, \cdot)$, and $b_{2,h}(\cdot, \cdot)$
with constants independent of the meshsizes $H$ and $h$, it is possible to derive
optimal a priori estimates. This is a consequence of the next lemma, which
follows from [3, Thm. 2.2].

Lemma 1. Under conditions (2)-(4), the following estimate holds with a
constant $C$ depending on $\alpha_0$, $k_0$ and the continuity constants of the involved
bilinear forms:

$$\|u - u_h^H\|_V + \|p - p_h\|_M \le C\inf_{v \in V_h^H}\|u - v\|_V + C\inf_{q \in M_h}\|p - q\|_M$$

$$+\ C\sup_{v \in V_h^H}\frac{(a - a_h)(u, v)}{\|v\|_V} + C\sup_{v \in V_h^H}\frac{(b_1 - b_{1,h})(v, p)}{\|v\|_V} + C\sup_{q \in M_h}\frac{(b_2 - b_{2,h})(u, q)}{\|q\|_M}.$$



The most delicate step for the quality of the discretization and the
computational complexity is the information transfer between the two grids via the
discrete Lagrange multiplier space $M_h$. In all the applications considered here, we
essentially couple between the global grid on $\Omega$ and the subdomain boundary
$\Gamma$. On $\Gamma$, dual Lagrange multipliers [13] are used to approximate $M$, which
have optimal stability and approximation properties. Moreover, they have
local support and satisfy a biorthogonality relation with the basis functions of
the trace space $V_h|_\Gamma$. Therefore, the implementation of the corresponding
operators $B_{1,h}$ and $B_{2,h}$ can be performed with low computational costs.
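
The biorthogonality relation can be made concrete for piecewise linear traces. The sketch below assumes the commonly used lowest-order dual basis on a reference edge $[0,1]$, namely $\psi_1 = 2\varphi_1 - \varphi_2$ and $\psi_2 = 2\varphi_2 - \varphi_1$; this specific choice and all names in the code are assumptions for illustration, and the paper may use a different or higher-order variant.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += Fraction(pi) * Fraction(qj)
    return out

def integrate01(p):
    """Exact integral of a polynomial over the reference edge [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

# Nodal basis on [0, 1]: phi_1 = 1 - x, phi_2 = x.
PHI = [[1, -1], [0, 1]]
# Assumed dual basis: psi_1 = 2*phi_1 - phi_2 = 2 - 3x, psi_2 = 2*phi_2 - phi_1 = 3x - 1.
PSI = [[2, -3], [-1, 3]]

def coupling_matrix():
    """Entries D[i][j] = integral of psi_i * phi_j over the edge."""
    return [[integrate01(poly_mul(psi, phi)) for phi in PHI] for psi in PSI]

print(coupling_matrix())
```

The resulting element matrix is diagonal with entries equal to the integrals of the nodal basis functions, so the multiplier can be eliminated node by node, which is the source of the low computational cost.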



3 A one-directionally coupled model problem 

We apply the framework presented in the last section to a one-directionally 
coupled model problem. We present a uniqueness proof and an a priori error 
estimate, which we confirm by a numerical example. 

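Continuous formulation. Consider the problem

$$-\Delta u_\Omega = f_\Omega \ \text{in } \Omega, \qquad u_\Omega|_{\partial\Omega} = 0, \tag{6}$$

with its associated bilinear form $a_\Omega(w_\Omega, v_\Omega) := \int_\Omega \nabla w_\Omega \cdot \nabla v_\Omega\,dx$ for $w_\Omega, v_\Omega \in
H_0^1(\Omega)$. We want to solve an additional problem for the subdomain $\omega$, namely,

$$-\Delta u_\omega = f_\omega \ \text{in } \omega, \qquad u_\omega = u_\Omega \ \text{on } \Gamma, \tag{7}$$

with $f_\omega = f_\Omega|_\omega$. Using Green's formula, we obtain

$$a_\omega(u_\omega, v_\omega) + \Big\langle v_\omega, -\frac{\partial u_\omega}{\partial n}\Big\rangle_{M' \times M} = (f_\omega, v_\omega)_\omega, \quad v_\omega \in H^1(\omega),$$

with the obvious meanings for $a_\omega(\cdot, \cdot)$ and $(\cdot, \cdot)_\omega$, and where $M = H^{-1/2}(\Gamma)$.
Introducing the Lagrange multiplier $p = -\partial u_\omega/\partial n$, we find that
$b_1(v, q) = \langle v_\omega, q\rangle_{M' \times M}$ for $v = (v_\Omega, v_\omega) \in V = H_0^1(\Omega) \times H^1(\omega)$, $q \in M$. We
realize the continuity requirement along the boundary $\Gamma$ in (7) by the bilinear
form $b_2(v, q) = \langle v_\Omega - v_\omega, q\rangle_{M' \times M}$, $v \in V$, $q \in M$, and obtain the saddle point
formulation of (6), (7) given in (1) with $g = 0$.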

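It is obvious that problem (1) has a unique solution, since the problem
on the global domain $\Omega$ is not influenced by the problem on the subdomain
$\omega$, and its solution $u_\Omega$ yields the boundary data for a well posed subdomain
problem. Nevertheless, we provide a complete proof within the saddle point
setting.

Theorem 1. With the above definitions, problem (1) is uniquely solvable.

Proof. We show the unique solvability of problem (1) by validating the
conditions (2)-(4). Our main tool is the harmonic extension operator $\mathcal{H} :
M' \to V_\omega$, defined by

$$a_\omega(\mathcal{H}w, v_\omega) = 0, \quad v_\omega \in V_\omega^0 = H_0^1(\omega), \qquad (\mathcal{H}w)|_\Gamma = w. \tag{8}$$

We observe that the trace of $V_\omega$ onto $\Gamma$ is the space $M'$, and that $b_1((0, v), q) =
b_2((0, -v), q)$. Taking $v = (0, \pm\mathcal{H}w)$, $w \in M'$, condition (4) is a consequence
of the definition of the $H^{-1/2}$-norm and of the fact that $\|\mathcal{H}w\|_{1,\omega} \le C\|w\|_{1/2,\Gamma}$.

Let us focus on condition (2). The kernels of the coupling operators are
$\operatorname{Ker}B_1 = V_\Omega \times V_\omega^0$ and $\operatorname{Ker}B_2 = \{v \in V : \operatorname{tr}v_\Omega = v_\omega|_\Gamma\}$, where $\operatorname{tr} : H^1(\Omega) \to
H^{1/2}(\Gamma)$ denotes the trace operator. We uniquely decompose $v_\omega \in V_\omega$ into
$v_\omega = v_{\mathcal{H}} + v_I$ such that $v_{\mathcal{H}} = \mathcal{H}(v_\omega|_\Gamma)$ and $v_I \in V_\omega^0$. For an arbitrary
$w_0 = (w_\Omega, w_{\mathcal{H}} + w_I) \in \operatorname{Ker}B_2$, we consider $v_0 = (w_\Omega, w_I) \in \operatorname{Ker}B_1$. By using the properties of
the harmonic extension, we get

$$\|w_\Omega\|^2_{1,\Omega} \ge c\Big(\int_\Gamma \operatorname{tr}w_\Omega\,ds\Big)^2 + c\,|w_\Omega|^2_{1,\Omega}
\ge c\Big(\int_\Gamma w_{\mathcal{H}}\,ds\Big)^2 + c\,|w_{\mathcal{H}}|^2_{1,\omega} \ge c\,\|w_{\mathcal{H}}\|^2_{1,\omega}. \tag{9}$$

Condition (2) follows from (9):

$$a(w_0, v_0) = a_\Omega(w_\Omega, w_\Omega) + a_\omega(w_I, w_I) \ge c\,\|w_\Omega\|^2_{1,\Omega} + c\,\|w_I\|^2_{1,\omega} \ge c\,\|w_0\|_V\,\|v_0\|_V. \tag{10}$$

The proof of condition (3) is similar. For an arbitrary $v_0 = (v_\Omega, v_I) \in \operatorname{Ker}B_1$,
we set $w_0 = (v_\Omega, \mathcal{H}(\operatorname{tr}v_\Omega) + v_I) \in \operatorname{Ker}B_2$, and obtain (10).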
Discretization. We use standard conforming finite elements of order $r$ and $s$
on $\mathcal{T}_H$ and $\mathcal{T}_h$, respectively. The associated discrete spaces with no boundary
conditions are denoted by $S_H^r(\Omega)$ and $S_h^s(\omega)$, and we set $S_{0,H}^r(\Omega) = S_H^r(\Omega) \cap
H_0^1(\Omega)$ and $S_{0,h}^s(\omega) = S_h^s(\omega) \cap H_0^1(\omega)$ to be the spaces taking into account
homogeneous Dirichlet conditions on $\partial\Omega$ and $\Gamma$, respectively. For the discrete
Lagrange multiplier space $M_h$, we propose the use of dual basis functions
[13] adapted to the order $s$ of elements from the trace space of $S_h^s(\omega)$, which
is indicated by $W_h^s(\Gamma)$. Setting $V_h^H = S_{0,H}^r(\Omega) \times S_h^s(\omega)$, the discrete saddle
point problem (5) is obtained. The unique solvability of problem (5) can be
shown by replacing the harmonic extension $\mathcal{H}$ and the trace operator $\operatorname{tr}$ in
the proof of Theorem 1 by discrete operators $\mathcal{H}_h : W_h^s(\Gamma) \to S_h^s(\omega)$ and
$\operatorname{tr}_h : S_{0,H}^r(\Omega) \to W_h^s(\Gamma)$, respectively. In order to obtain the estimate (9),
these operators have to satisfy certain stability and extension properties with
respect to the $H^{1/2}$-norm. The discrete harmonic extension $\mathcal{H}_h$ is naturally
obtained by taking $S_{0,h}^s(\omega)$ as a test function space in (8), and the operator
$\operatorname{tr}_h$ is given by the mortar projection associated with the discrete Lagrange
multiplier space $M_h$; in particular, for this choice, we find $\int_\Gamma w_\Omega\,ds = \int_\Gamma \operatorname{tr}_hw_\Omega\,ds$.

We intend to use a smaller meshsize $h < H$ or a higher order $s > r$ on
the subdomain, and, therefore, expect a better solution $u_h$ compared to $u_H$.
Thus, the finite element solution is defined by

$$u_{FE} := \begin{cases} u_H & \text{in } \omega^c = \Omega \setminus \bar\omega, \\ u_h & \text{in } \omega. \end{cases}$$

Lemma 1 only provides a global estimate, which is not sufficient here, since we
would like to disregard the approximate solution component $u_H$ on the subdomain
$\omega$. The necessary tools for a more local analysis can be found in [12], resulting
in the following estimate, which is proved in [7].

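Theorem 2. Let $B \supset \omega$ be such that $d = \operatorname{dist}(\partial B \setminus \partial\Omega, \partial\omega \setminus \partial\Omega) > 0$. Then
for $H$ small enough and $u$ regular enough, there exists a constant $C$ depending
on $d$ such that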

+ 11'^ — ^ Ch^\u\s-\-l^u) + CH'^\u\rJ^l^B + l'^lr+1,^?* 

( 11 ) 

We note that the last term in (11) is the fundamental difference of our approach 
to the estimates obtained by standard adaptive finite element methods. It is 
due to the fact that in our one-directionally coupled approach no pollution 
effect is taken into account. 

Numerical test. Consider the model problem (6) on $\Omega := (0, 1)^2$ with source
term $f$ derived from the exact solution $u(x, y) := \exp(-100((x - 0.6)^2/a^2 + (y -
0.5)^2/b^2))$. An elliptic patch with radii 0.25 and 0.15 is placed in the domain
$\Omega$ with its center at (0.6, 0.5), as illustrated in Figure 1. Since the solution
goes to zero with an exponential decay, we may have a coarser triangulation
far enough away from (0.6, 0.5). Therefore, we choose an initial triangulation
with $h/H = 1/4$. We use P1 elements on $\mathcal{T}_H$, whereas on $\mathcal{T}_h$, we consider two
different cases and use P1 elements for one test, and P2 elements for another
test. Figure 2 shows the decay of the errors ch = u — uh and epE = u — in 
the P^-norm under uniform refinement. The errors en and epE both satisfy the 





degrees of freedom degrees of freedom 

Fig. 2. Error decay in the P^-norm of Pl-Pl and P1-P2 coupling 

a priori estimates. Choosing the same number of unknowns for the standard
and the overlapping method, the solution obtained by the P1-P1 coupling is
significantly better than the solution obtained by the standard method. For
the P1-P2 coupling, the error decay is almost optimal with respect to the
piecewise quadratic finite elements used on $\mathcal{T}_h$. In agreement with (11), the
error behaves like -t- C 2 P, and, numerically, C 2 ci. 
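
The source term for this test follows from $f = -\Delta u$. A minimal sketch is given below; it assumes that the parameters $a$ and $b$ in the exact solution coincide with the patch radii 0.25 and 0.15, which the text suggests but does not state explicitly, and the function names are illustrative.

```python
import math

A_RAD, B_RAD = 0.25, 0.15  # assumption: a, b equal the radii of the elliptic patch

def u_exact(x, y, a=A_RAD, b=B_RAD):
    """Exact solution u = exp(-100 * ((x-0.6)^2/a^2 + (y-0.5)^2/b^2))."""
    return math.exp(-100.0 * ((x - 0.6) ** 2 / a ** 2 + (y - 0.5) ** 2 / b ** 2))

def source(x, y, a=A_RAD, b=B_RAD):
    """Source term f = -Laplace(u), obtained by differentiating u_exact analytically."""
    gx = -200.0 * (x - 0.6) / a ** 2  # d/dx of the exponent
    gy = -200.0 * (y - 0.5) / b ** 2  # d/dy of the exponent
    lap = u_exact(x, y, a, b) * (gx ** 2 + gy ** 2 - 200.0 / a ** 2 - 200.0 / b ** 2)
    return -lap

def source_fd(x, y, step=1e-5):
    """Second-order finite-difference approximation of -Laplace(u), used as a cross-check."""
    u0 = u_exact(x, y)
    lap = (u_exact(x + step, y) + u_exact(x - step, y)
           + u_exact(x, y + step) + u_exact(x, y - step) - 4.0 * u0) / step ** 2
    return -lap

print(source(0.6, 0.5), source(0.62, 0.51), source_fd(0.62, 0.51))
```

Since $u$ decays rapidly away from (0.6, 0.5), the values of $f$ are negligible near $\partial\Omega$, which is consistent with using a coarser triangulation far from the patch.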

4 Bi-directionally coupled problems 

We present two applications which result in bi-directionally coupled problems. 
The first one illustrates a complementary coupling procedure, the second one 
considers an eddy current problem. 

4.1 Natural boundary conditions at the hole $\omega$

We want to solve a boundary value problem on the domain $\Omega \setminus \bar\omega =: \omega^c$ with
natural boundary conditions on the hole boundary $\Gamma$. The solution on the
global domain $\Omega$ yields the solution on the domain with hole $\omega^c$. This problem
is analyzed for the linear elasticity setting in [9]. In addition, we show an
application for rotating "holes", see Figure 4.

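Saddle point formulation for the linear elasticity problem. We
consider the problem: find $u_c \in (H^1(\omega^c))^2$ such that

$$-\operatorname{Div}\sigma(u_c) = f_c \ \text{in } \omega^c,$$

$$\sigma(u_c)\,n_c = t \ \text{on } \Gamma,$$

with $u_c = 0$ on $\Gamma_0$ and $\sigma(u_c)\,n_c = g$ on $\Gamma_1$, where $\Gamma_0 \subset \partial\Omega$ has a positive
measure and $\partial\Omega = \Gamma_0 \cup \Gamma_1$, $\Gamma_0 \cap \Gamma_1 = \emptyset$, with body forces $f_c \in (L^2(\omega^c))^2$ and
surface tractions $g \in (L^2(\Gamma_1))^2$, $t \in (L^2(\Gamma))^2$. An equivalent formulation is
given by

$$-\operatorname{Div}\sigma(u_\Omega) = f_\Omega \ \text{in } \Omega,$$

$$-\operatorname{Div}\sigma(u_\omega) = f_\omega \ \text{in } \omega,$$

$$u_\Omega = u_\omega \ \text{on } \Gamma,$$

$$[\sigma(u_\Omega)\,n_c] = -\sigma(u_\omega)\,n_c + t \ \text{on } \Gamma,$$

with $u_\Omega = 0$ on $\Gamma_0$ and $\sigma(u_\Omega)\,n_c = g$ on $\Gamma_1$, and where $[w] := w|_{\omega^c} -
w|_\omega$, $f_\omega = f_\Omega|_\omega$ and $f_\Omega \in (L^2(\Omega))^2$ is an extension of $f_c$ to $\Omega$. It can be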
written in its weak formulation as saddle point problem with the bilinear forms 
clq{uq,vq) /q(7{uq) \ e{vQ),a^{u^,v^) = \ e{vuj),hi{v,q) 

Vf 2 , q)MxM', h{y,q) = {vuj - Vf 2 , q)MxM', and V = x 

M = . For the discretization, we use linear and quadratic finite 

elements on quasi-uniform and shape regular triangulations. The conditions
(2)-(4) hold for $h/H$ small enough in the discrete setting as well as in the
continuous setting, yielding unique solvability. For details, we refer to [9].

The realization of this approach allows for an easy shift of the hole without
having to remesh, and can be used in shape optimization algorithms to
determine an optimal hole position. Note that the quantity we pass back from the
hole to the background is the jump in the fluxes, i.e., in general the solution
$u_\Omega$ is only $H^{3/2-s}$-regular, $s > 0$. Another application of this complementary
coupling technique is time dependent problems where the hole is an object
moving through the domain $\Omega$ and emanating some flux into $\omega^c$.

Numerical examples 

Beam with one hole. We consider the problem domain $\omega^c = (-5, 5) \times (0, 1) \setminus
\{(x, y) \in \mathbb{R}^2 \mid \|(x, y) - (-1, 0.5)\| < 0.3\}$ with $u_c = 0$ for $x = \pm 5$, $\sigma(u_c)\,n_c = (0, -1)$ for
y = ct{uc) Uc = 0 elsewhere and f = 0. We use Young’s modulus E = 200 
and Poisson ratio $\nu = 0.3$. The stress $\sigma_{xx}$ is monitored as a graphical quantity,
see Figure 3. The iterative solver is based on a block Gauß-Seidel method for
the symmetric positive definite system arising from static condensation of the
Lagrange multiplier. The convergence rates are level independent, see [9].






Fig. 3. Top: Problem setup and start grids. Bottom: $\sigma_{xx}$ on $\Omega$ and $\omega$ using
complementary coupling; we show only the values on $\omega^c$.
Rotating hole. The hole domain is a smooth star with five points described by
(x, y) = (rux^ ruy) 4- + Ar) cos(lOTrA) (cos(a(A, t)), sin(a(A, t))) with A, t G 

[0, 1] and a(A, t) = 27t(A — t — cos(107t(A + |)/51)), and a center {nix^ rUy) = 
(1.1,0.65)), the medium radius = 0.2 and the radius change amplitude 
$\Delta r = 0.05$. We solve the Poisson problem. The boundary segment described
by $\lambda \in (0, 0.1)$ carries non-zero natural data. The background region is
$\Omega = (0, 3) \times (0, 1)$. The situation and the solution at $t = 0, 0.15, 0.3, 0.45, 0.65, 0.85$
can be found in Figure 4. For each new position of the star, no remeshing
procedure has to be carried out; the existing grid simply has to be rotated.
Moreover, only the coupling matrix has to be reassembled; all other matrices
involved stay the same throughout the whole computation.




Fig. 4. Problem setup (top left): Zero natural b.c. in hatched areas, 0 and 1 Dirichlet
b.c. at the top left and bottom right side, respectively, influx of 5 (natural b.c.)
at the back side of the first wing. Initial background grid plus rotated star grid
at $t = 0, 0.15, 0.3, 0.45$ (top right). Bottom: Solutions at different times $t$: $t = 0$
(complete), $t = 0.15, 0.3, 0.45, 0.65, 0.85$ (partly displayed).



4.2 Eddy current simulation 

We want to approximate the eddy currents inside a conductor $\omega$ which is
exposed to a time dependent electromagnetic field acting in the global domain
$\Omega$. A detailed problem description and analysis concerning the statically
condensed elliptic system can be found in [8]; numerical results are available in
[6]. Here, we present an alternative approach which fits into the saddle point
framework presented in Section 2.

Saddle point formulation. Elimination of the other involved field quantities
from the quasistationary Maxwell equations yields for the magnetic field $H$

$\operatorname{div} \mu H = 0$ in $\Omega$, (12)

$\partial_t H + \frac{1}{\sigma\mu} \operatorname{curl}\operatorname{curl} H = 0$ in $\omega$, (13)

with positive material parameters $\mu$ and $\sigma$, and $\mu$ constant in $\omega$. We assume
that a source vector potential $T_s$ is known such that $\operatorname{curl} T_s = J_s$, with
$J_s$ a given source current density. The magnetic field $H$ is decomposed into
$T - \operatorname{grad}\phi$ on $\omega$ and $T_s - \operatorname{grad}\phi$ on $\omega^c$, where $T$ is a vector valued
potential defined on the conductor $\omega$, and $\phi \in H_0^1(\Omega)$ is a scalar valued
potential defined on the global domain $\Omega$. Moreover, we use the Coulomb gauge,
i.e., $T$ is chosen to be solenoidal. From (12), we obtain

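$a_\Omega(\phi, v) - \int_\Gamma (Tn)\,v = \int_{\omega^c} \beta\, T_s \operatorname{grad} v, \quad v \in V_\Omega = H_0^1(\Omega)$, (14)

with $a_\Omega(w, v) = \int_\Omega \beta \operatorname{grad} w \operatorname{grad} v$ and $\beta$ depending on $\mu$. Taking $v \in H_0^1(\omega)$ in
(14) implies that $\phi$ is harmonic on $\omega$; thus, there exists $\gamma = \phi|_\Gamma \in H^{1/2}(\Gamma)$
such that $\phi|_\omega = \mathcal{H}\gamma$ with the harmonic extension operator $\mathcal{H} : H^{1/2}(\Gamma) \to H^1(\omega)$.
Furthermore, due to the solenoidality of $T$, it holds that
$\int_\Gamma (Tn)\,\lambda - \int_\omega T \operatorname{grad} \mathcal{H}\lambda = 0$ for an arbitrary $\lambda \in H^{1/2}(\Gamma)$. After time
discretization by an implicit Euler scheme with time step size $\Delta t$, we obtain
from (13) at each time step:

$a_\omega((\gamma, T), (\lambda, W)) + \int_\Gamma (Tn)\,\lambda = \int_\omega f_\omega W, \quad (\lambda, W) \in V_\omega$, (15)

with

$a_\omega((\gamma, T), (\lambda, W)) = \int_\omega \alpha \operatorname{curl} T \operatorname{curl} W + T\,W - W \operatorname{grad} \mathcal{H}\gamma - T \operatorname{grad} \mathcal{H}\lambda$, (16)

where $\alpha = \Delta t/(\mu\sigma)$, and $f_\omega$ contains the information from the preceding time
step.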

This suggests the introduction of the Lagrange multiplier p = Tn e M = 
jj-i/ 2(2 ''), q£ coupling bilinear form b{{v, X,W),q) = (A — v,q)M'xM 
for {v, A, W) e V = Vq X Vuj and q e M. Setting ^ 0 and 6i(-, •) = &2(*, *) = 

$b(\cdot,\cdot)$, problem (1) is obtained. By choosing $(\lambda, W) = (0, \operatorname{grad} v)$, $v \in H_0^1(\omega)$,
in (15), it is easy to see that the solenoidality of $T$ is guaranteed provided
that $f_\omega$ is divergence free. The unique solvability of the statically condensed
formulation of problem (1) is proved in [8].

Discretization. For the approximation of $\phi$, piecewise linear finite elements
are used on $\mathcal{T}_H$. Concerning the vector potential $T$, we employ curl-conforming
edge elements [10] on $\mathcal{T}_h$, which are ideally suited for this purpose,
whereas for $\gamma$, we use piecewise linear finite elements on $\Gamma$. As before,
we approximate the Lagrange multiplier space $M$ by dual basis functions.
In contrast to the preceding applications, the bilinear form $a(\cdot,\cdot)$ cannot be
implemented directly, and we need an approximation $a_h(\cdot,\cdot)$. Therefore, the
harmonic extension operator $\mathcal{H}$ in (16) is replaced by its discrete analogue $\mathcal{H}_h$
corresponding to piecewise linear finite elements on $\mathcal{T}_h$. The gradient operator
can be easily realized by the node-to-edge incidence matrix $G$ which acts on
the degrees of freedom associated with the linear elements on $\mathcal{T}_h$, and gives the
gradient as a linear combination of the basis functions for the edge element
space [4]. An optimal a priori estimate based on Lemma 1 for the finite element
solution $(\phi_H, T_h)$ of the statically condensed form of (5) is obtained in [8],
provided that the ratio $h/H$ is small enough.
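
The action of a node-to-edge incidence matrix can be illustrated on a single triangle. The following Python sketch is not taken from [4]; the tiny mesh, the edge orientation and the helper names `incidence_matrix` and `apply` are assumptions made purely for illustration. Each row holds $-1$ at the tail node and $+1$ at the head node of an oriented edge, so applying the matrix to nodal values of a piecewise linear function returns its differences along the edges, and these differences sum to zero around a closed loop (the discrete counterpart of curl grad = 0).

```python
def incidence_matrix(num_nodes, edges):
    """Node-to-edge incidence matrix: one row per oriented edge (tail, head)."""
    G = [[0] * num_nodes for _ in edges]
    for row, (tail, head) in zip(G, edges):
        row[tail] = -1
        row[head] = 1
    return G


def apply(G, nodal_values):
    """Matrix-vector product G * u for a list-of-lists matrix."""
    return [sum(g * u for g, u in zip(row, nodal_values)) for row in G]


# Hypothetical mesh: one triangle with nodes 0, 1, 2 and a closed edge loop.
edges = [(0, 1), (1, 2), (2, 0)]
G = incidence_matrix(3, edges)

# Nodal values of a piecewise linear function; G returns edge differences.
edge_dofs = apply(G, [1.0, 4.0, 6.0])

# Summing the differences around the closed loop gives zero.
circulation = sum(edge_dofs)
```

In an actual edge element code the matrix would be stored in sparse form; the dense list-of-lists layout is used here only to keep the sketch self-contained.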
Summary. Multi-step quasi-Newton methods for optimization employ, at each
iteration, an interpolating polynomial in the variable space to construct a multi-step
version of the well-known Secant Equation (the relation which constrains the
updating of the Hessian approximation). There is some freedom in the choice of the
interpolating polynomial and this freedom is exploited, in the case of two-step methods,
by the so-called “Minimum Curvature” algorithms, which produce the ‘smoothest’
interpolation, in the sense of obtaining the polynomial with the smallest possible
second derivative (measured in some suitable norm). Typically, these norms are
defined by a positive-definite matrix and, in this paper, we will consider and compare
the use of different matrices in defining the norm. In particular, we will describe
the construction of implicit methods, in which, as we will demonstrate, there is no
requirement to compute the matrix defining the norm explicitly.

1 Introduction 

We consider two-step quasi-Newton methods for the unconstrained optimization
problem

$\min f(x) \quad (x \in \mathbb{R}^n)$.

If we denote the gradient and Hessian of $f$ by $g$ and $G$, respectively, then
such methods are organised in a manner that closely resembles the structure
of standard quasi-Newton methods, except that the approximation $B_{i+1}$ to
the Hessian $G(x_{i+1})$ is required to satisfy a condition of the form (where $\gamma_i$ is
a scalar determined by the precise variant of the method under consideration)

$B_{i+1}(s_i - \gamma_i s_{i-1}) = y_i - \gamma_i y_{i-1}$, or (1)

$B_{i+1} r_i = w_i$, say, (2)

in place of the usual condition (known as the Secant Equation)

$B_{i+1} s_i = y_i$. (3)

(In equations (1) and (3), the step-vectors $s_j$ and $y_j$ are defined by

$s_j \stackrel{\text{def}}{=} x_{j+1} - x_j$, (4)

$y_j \stackrel{\text{def}}{=} g(x_{j+1}) - g(x_j) = g_{j+1} - g_j$, say, (5)




Implicit Updates in Minimum Curvature Methods 327 



where $\{x_j\}$ are the successive iterates produced by the method.) A matrix
satisfying (1)/(2) can be constructed by appropriately modifying, for example,
the BFGS update formula, as follows:

$B_{i+1} = B_i - \dfrac{B_i r_i r_i^T B_i}{r_i^T B_i r_i} + \dfrac{w_i w_i^T}{w_i^T r_i}$ (6)

$\phantom{B_{i+1}} \stackrel{\text{def}}{=} \mathrm{BFGS}(B_i, r_i, w_i)$, say. (7)
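
That an update of the form (6) enforces the condition (2) can be checked numerically. The sketch below is a minimal pure-Python illustration and not the authors' implementation; the helper names and the $2 \times 2$ test data are assumptions chosen for the example. It applies the standard BFGS formula with a generic pair $(r, w)$ and verifies that the updated matrix maps $r$ to $w$.

```python
def matvec(A, x):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bfgs_update(B, r, w):
    """Return B - (B r)(B r)^T / (r^T B r) + w w^T / (w^T r) for symmetric B."""
    Br = matvec(B, r)
    rBr = dot(r, Br)
    wr = dot(w, r)
    n = len(r)
    return [[B[i][j] - Br[i] * Br[j] / rBr + w[i] * w[j] / wr
             for j in range(n)] for i in range(n)]


# Hypothetical data: a symmetric positive-definite B and a pair with w^T r > 0.
B = [[2.0, 0.0], [0.0, 1.0]]
r = [1.0, 1.0]
w = [3.0, 2.0]

B_new = bfgs_update(B, r, w)

# The updated matrix should satisfy B_new r = w up to rounding error.
residual = [a - b for a, b in zip(matvec(B_new, r), w)]
```

Passing $r = s_i$ and $w = y_i$ recovers the usual single-step update, for which the same check reproduces the Secant Equation (3).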



The derivation of the condition (1) is described by Ford and Moghrabi [5, 4].
In short, quadratic curves $x(\tau)$ and $g(\tau)$ in $\mathbb{R}^n$ (where $\tau \in \mathbb{R}$) are constructed
which interpolate respectively, for the same set of values of $\tau$, the three most
recent iterates $x_{i-1}$, $x_i$ and $x_{i+1}$, and the three associated gradient evaluations
(assumed to be known). The derivatives of these two curves (at $\tau = \tau_2$, where
$\tau_2$ is the value of $\tau$ corresponding to $x_{i+1}$ and $g(x_{i+1})$ on the respective curves)
are then substituted into the relation

$G(x_{i+1})\, x'(\tau_2) = g'(x(\tau_2))$, (8)

derived by applying the Chain Rule to the function $g(x(\tau))$. (In (8), primes
denote differentiation with respect to $\tau$.) Of course,

$w_i \stackrel{\text{def}}{=} g'(\tau_2)$ (9)

will, in general, only be an approximation to the vector $g'(x(\tau_2))$ required in
(8), whereas

$r_i \stackrel{\text{def}}{=} x'(\tau_2)$ (10)

may be computed exactly. Nevertheless, on making these substitutions into
(8) and removing a common scaling factor, a relation of the form (1) for
$B_{i+1} \approx G(x_{i+1})$ is obtained.
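
The substitution into (8) can be made concrete for a quadratic objective, where the gradient is affine in $x$ and the interpolated gradient curve therefore introduces no error. The following sketch is illustrative only: the Hessian `G`, the vector `b`, the three iterates and the parameter values `taus` are hypothetical, and the helper names are not from [5, 4]. It differentiates the two interpolating quadratics at $\tau_2$ by means of Lagrange weights and checks that $G\,x'(\tau_2)$ coincides with the derivative of the gradient curve.

```python
def derivative_weights(t0, t1, t2):
    """Weights giving the derivative at t2 of the quadratic through t0, t1, t2."""
    return ((t2 - t1) / ((t0 - t1) * (t0 - t2)),
            (t2 - t0) / ((t1 - t0) * (t1 - t2)),
            (2 * t2 - t0 - t1) / ((t2 - t0) * (t2 - t1)))


def curve_derivative(points, taus):
    """Derivative at taus[2] of the vector-valued quadratic interpolating points."""
    weights = derivative_weights(*taus)
    dim = len(points[0])
    return [sum(c * p[k] for c, p in zip(weights, points)) for k in range(dim)]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical quadratic f(x) = 0.5 x^T G x - b^T x, so that g(x) = G x - b.
G = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def gradient(x):
    return [gx - bk for gx, bk in zip(matvec(G, x), b)]


iterates = [[0.0, 0.0], [1.0, 0.5], [1.5, 2.0]]   # x_{i-1}, x_i, x_{i+1}
taus = (0.0, 0.4, 1.0)                            # tau_0, tau_1, tau_2

x_prime = curve_derivative(iterates, taus)
g_prime = curve_derivative([gradient(x) for x in iterates], taus)

# For a quadratic objective the two sides of (8) agree up to rounding error.
mismatch = max(abs(a - c) for a, c in zip(matvec(G, x_prime), g_prime))
```

For a general nonlinear objective the two sides differ, which is why the interpolated derivative is only an approximation in that case.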

The remainder of this paper is organised as follows: in Section 2, we review
the “minimum curvature” approach to determining a suitable set of parameters
$\{\tau_j\}_{j=0}^{2}$, while Section 3 describes the concept of implicit updates. Section
4 then develops the use of implicit updates within “minimum curvature” methods.
Finally, we present the results of numerical experiments in Section 5, and
Section 6 draws conclusions on the basis of the results.



2 Minimum curvature methods 

If the values of $\tau$ corresponding to $x_{i-1}/g(x_{i-1})$ and $x_i/g(x_i)$ are denoted by
$\tau_0$ and $\tau_1$ respectively, then Ford and Moghrabi [6] observed that, without loss
of generality, the values $\tau_0 = 0$ and $\tau_2 = 1$ could be specified, leaving the
remaining value $\tau_1$ to be chosen according to suitable criteria. Defining the
related quantity

$\delta \stackrel{\text{def}}{=} (\tau_2 - \tau_1)/(\tau_1 - \tau_0) = (1 - \tau_1)/\tau_1 \;\Rightarrow\; \tau_1 = (1 + \delta)^{-1}$, (11)

they then discussed choosing the parameter $\delta$ to minimize the norm $\| x''(\tau) \|_M$,
where

$\| z \|_M \stackrel{\text{def}}{=} (z^T M z)^{1/2}$ (12)

and where $M$ is a given symmetric positive-definite matrix, thus producing
an interpolating curve $x(\tau)$ which is the “smoothest” in the sense of the norm
$\| \cdot \|_M$. It was shown in [6] that fulfilling this criterion leads to the requirement
of solving the cubic polynomial

$\psi(\delta) \stackrel{\text{def}}{=} \delta^3 - \mu\delta^2 + \mu\delta - \sigma = 0$ (13)

at each iteration, where (with $\sigma_j \stackrel{\text{def}}{=} \| s_j \|_M^2$ and $\mu_i \stackrel{\text{def}}{=} s_{i-1}^T M s_i$)

$\sigma \stackrel{\text{def}}{=} \sigma_i / \sigma_{i-1}$, (14)

$\mu \stackrel{\text{def}}{=} \mu_i / \sigma_{i-1}$. (15)

In [6], the properties of the polynomial $\psi$ were analyzed and it was shown that
its zeroes may be determined efficiently. In addition, the circumstances under
which $\psi$ has three real zeroes (rather than only one) were identified and it was
shown (in that case) which zero should be selected to yield the lowest curvature.
It was observed that, in general, this approach was capable (depending on the
relative dispositions of the three iterates $x_{i-1}$, $x_i$ and $x_{i+1}$) of producing all
three essentially different orderings of the iterates on the interpolating curve.
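
The minimisation over $\delta$ can be sketched in a few lines for the simplest choice $M = I$. The code below is an illustration under stated assumptions, not the procedure analysed in [6]: with $\tau_0 = 0$, $\tau_1 = 1/(1+\delta)$ and $\tau_2 = 1$, the second derivative of the interpolating quadratic is $2\,(s_i/(1-\tau_1) - s_{i-1}/\tau_1)$, and setting the derivative of its squared norm to zero gives the unnormalised cubic $\sigma_{i-1}\delta^3 - \mu_i\delta^2 + \mu_i\delta - \sigma_i = 0$. The function names are hypothetical, only one positive zero is located (by bisection), and the selection among several real zeroes discussed above is not implemented.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def curvature_sq(delta, s_prev, s_cur):
    """Squared Euclidean norm of x'' for tau_0 = 0, tau_1 = 1/(1+delta), tau_2 = 1."""
    tau1 = 1.0 / (1.0 + delta)
    xpp = [2.0 * (c / (1.0 - tau1) - p / tau1) for p, c in zip(s_prev, s_cur)]
    return dot(xpp, xpp)


def positive_zero(sigma_prev, sigma_cur, mu, iterations=200):
    """Bisection for a positive zero of sigma_prev*d^3 - mu*d^2 + mu*d - sigma_cur."""
    def poly(d):
        return sigma_prev * d ** 3 - mu * d ** 2 + mu * d - sigma_cur

    lo, hi = 0.0, 1.0          # poly(0) = -sigma_cur < 0
    while poly(hi) < 0.0:      # expand until the sign changes
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if poly(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def min_curvature_delta(s_prev, s_cur):
    """delta from the cubic, using M = I (an assumption of this sketch)."""
    return positive_zero(dot(s_prev, s_prev), dot(s_cur, s_cur), dot(s_prev, s_cur))


# Orthogonal steps of equal length: the cubic reduces to delta^3 - 1 = 0.
delta_orth = min_curvature_delta([1.0, 0.0], [0.0, 1.0])

# Collinear steps: the iterates lie on a line and the curvature can vanish.
delta_line = min_curvature_delta([1.0, 0.0], [2.0, 0.0])
```

In the collinear case the computed $\delta$ equals the ratio of the step lengths, so the parametrisation becomes proportional to distance along the line.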

In their original paper [6] on these “minimum curvature” methods, the
authors reported the results of numerical experiments conducted with the
version of the algorithm (called A) obtained by making the straightforward
choice $M = I$, and showed that this yielded a substantial improvement in
performance, when compared with the standard BFGS method. In a subsequent
paper [7], they investigated the performance of the algorithm, called B, arising
from choosing $M = B_i$ (this choice being motivated by the previous success
of other multi-step methods employing the same matrix), and showed that
a further improvement in performance was thereby obtained. In this paper, we
will pursue this avenue of investigation further by considering related choices
for the matrix $M$. In particular, we will consider the use of implicit updates -
that is, updated forms of the matrix $B_i$ which are not calculated explicitly.



3 Implicit updates 

Since the choice $M = B_i$ produces an algorithm with good numerical
performance, a natural question to consider is whether related matrices might yield
further gains. An obvious line of enquiry to pursue in answering this question
is the use of updated versions of $B_i$, where the update employs data from
the most recent iteration(s). Because this updated matrix (call it $\hat{B}_i$ for the
present) will be used to compute $\delta$ and hence $r_i$ and $w_i$, it cannot
be the matrix $B_{i+1}$ which will be produced via equation (6). Thus, it appears
that use of such a matrix would necessitate carrying out a second update
(requiring $O(n^2)$ operations) at each iteration. However, we observe that explicit
knowledge of the updated matrix is not our real goal - rather, we only need this
matrix to enable us to calculate $\sigma_i$, $\sigma_{i-1}$ and $\mu_i$ and hence (by solving the cubic
polynomial) $\delta$, $\gamma_i$, $r_i$ and $w_i$. Therefore, if it can be shown that the expressions
required in equations (14) and (15) may be computed at low cost without
explicit calculation of $\hat{B}_i$, then we will have gained our objective of using an
updated matrix, while avoiding most of the computational expense. Because
explicit computation of $\hat{B}_i$ is avoided, methods which use this technique were
termed implicit methods in [3].



4 Implicit updates in minimum curvature methods 

For the purposes of simplicity and of easing the computational requirements,
we will only consider here single-step implicit updates of $B_i$ (although use of
two-step updates is an obvious further line of research). We therefore propose
the following matrices for use in the norm $\| \cdot \|_M$, denoting the algorithms
thus defined by C and D:

C: $M = \mathrm{BFGS}(B_i, s_{i-1}, y_{i-1})$ (16)

D: $M = \mathrm{BFGS}(B_i, s_i, y_i)$. (17)

In order to avoid the explicit computation of these updated matrices (for the
reasons explained above), it is necessary to show (in the context of “minimum
curvature” methods) how the quantities $\sigma_i$, $\sigma_{i-1}$ and $\mu_i$ can be calculated
without explicit knowledge of the matrix. We consider this issue for each update,
in turn.



4.1 Update C 



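Since the matrix $M$ defined in (16) is constructed by means of a standard
single-step BFGS update, it follows immediately from (16) that

$M s_{i-1} = y_{i-1}$.

On the other hand, we can use equations (6) and (7) to show that

$M s_i = B_i s_i - \left\{ \dfrac{s_{i-1}^T B_i s_i}{s_{i-1}^T B_i s_{i-1}} \right\} (B_i s_{i-1}) + \left\{ \dfrac{s_i^T y_{i-1}}{s_{i-1}^T y_{i-1}} \right\} y_{i-1}$ (18)

$\phantom{M s_i} = -t_i g_i + t_i \left\{ \dfrac{s_{i-1}^T g_i}{s_{i-1}^T B_i s_{i-1}} \right\} (B_i s_{i-1}) + \left\{ \dfrac{s_i^T y_{i-1}}{s_{i-1}^T y_{i-1}} \right\} y_{i-1}$. (19)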




330 J.A. Ford, I.A. Moghrabi



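(In deriving (19), we have assumed that $x_{i+1}$ is obtained by some form of
search along the ‘quasi-Newton’ direction

$p_i = -B_i^{-1} g_i$, (20)

which implies that

$B_i s_i = -t_i g_i$, (21)

for some positive scalar $t_i$.) Thus we are able to derive the following expressions
for the quantities $\sigma_i$, $\sigma_{i-1}$ and $\mu_i$:

$\sigma_{i-1} = s_{i-1}^T y_{i-1}$; (22)

$\sigma_i = -t_i s_i^T g_i - t_i^2 \left\{ \dfrac{(s_{i-1}^T g_i)^2}{s_{i-1}^T B_i s_{i-1}} \right\} + \left\{ \dfrac{(s_i^T y_{i-1})^2}{s_{i-1}^T y_{i-1}} \right\}$; (23)

$\mu_i = s_i^T y_{i-1}$. (24)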



4.2 Update D 



In a similar manner, we can derive the following expressions for $\sigma_i$, $\sigma_{i-1}$ and
$\mu_i$ in the case when the implicit update (17) is applied:

$\sigma_{i-1} = s_{i-1}^T B_i s_{i-1} + t_i \left\{ \dfrac{(s_{i-1}^T g_i)^2}{s_i^T g_i} \right\} + \left\{ \dfrac{(s_{i-1}^T y_i)^2}{s_i^T y_i} \right\}$; (25)

$\sigma_i = s_i^T y_i$; (26)

$\mu_i = s_{i-1}^T y_i$. (27)

It is evident, from equations (23) and (25), that there remains one obstacle
to be overcome, in each case, in achieving our goal of avoiding terms requiring
$O(n^2)$ computation in the calculation of $\sigma = \sigma_i/\sigma_{i-1}$. That difficulty resides
in the computation of the product $s_{i-1}^T B_i s_{i-1}$. This problem has been tackled
in two ways - by alternation of updates and by use of a recurrence.



4.3 Alternation 

Alternation [8] involves the repeated application of a cycle consisting (in its
basic form) of two iterations, the first of which is a standard single-step BFGS
iteration and the second of which is the required two-step method. The
consequence of this arrangement is that each two-step iteration can be implemented
in the knowledge that the relation

$B_i s_{i-1} = y_{i-1}$ (28)

holds (because of the preceding single-step iteration). This implies that

$s_{i-1}^T B_i s_{i-1} = s_{i-1}^T y_{i-1}$, (29)

which means that $s_{i-1}^T B_i s_{i-1}$ can be computed in $O(n)$ operations.
4.4 Recurrence



In [9] , the authors showed how the following recurrence for the efficient calcu- 
lation of the quantity Xi could be derived: 



7Ti_i = +7f_iAi_i - ; 

. T ^2 ^2 T 



Xi = 









Li-im-i 



i-1' 



'^i—1 



where 



(30) 



$r_{i-1} = s_{i-1} - \gamma_{i-1} s_{i-2}$, and (31)

$w_{i-1} = y_{i-1} - \gamma_{i-1} y_{i-2}$. (32)

Again, only 0{n) operations are required in order to obtain Aj = 



4.5 New methods 

In fact, the algorithm denoted by B was developed before the derivation of
the recurrence (30), so we have a total of five new algorithms to compare with
the existing methods BFGS, A (using $M = I$) and B (using $M = B_i$ and
alternation). For consistency of notation, we will now rename B as $B_{\mathrm{alt}}$, since
it uses alternation. The five new algorithms are therefore

1. $B_{\mathrm{recur}}$ (using $M = B_i$ and the recurrence (30))

2. $C_{\mathrm{alt}}$ (using $M = \mathrm{BFGS}(B_i, s_{i-1}, y_{i-1})$ and alternation)

3. $C_{\mathrm{recur}}$ (using $M = \mathrm{BFGS}(B_i, s_{i-1}, y_{i-1})$ and the recurrence (30))

4. $D_{\mathrm{alt}}$ (using $M = \mathrm{BFGS}(B_i, s_i, y_i)$ and alternation)

5. $D_{\mathrm{recur}}$ (using $M = \mathrm{BFGS}(B_i, s_i, y_i)$ and the recurrence (30)).



5 Numerical experiments 



The algorithms $C_{\mathrm{alt}}$, $C_{\mathrm{recur}}$, $D_{\mathrm{alt}}$ and $D_{\mathrm{recur}}$ derived from the new implicit
updates were compared with each other, in our first set of experiments. All the
multi-step algorithms tested in these and the following experiments employed
the BFGS formula to update the inverse Hessian approximations $H_i \equiv B_i^{-1}$,
but with the usual vectors $s_i$ and $y_i$ replaced by $r_i$ and $w_i$:

$H_{i+1} = H_i + \left\{ 1 + \dfrac{w_i^T H_i w_i}{r_i^T w_i} \right\} \dfrac{r_i r_i^T}{r_i^T w_i} - \left\{ \dfrac{H_i w_i r_i^T + r_i w_i^T H_i}{r_i^T w_i} \right\}$. (33)



The line-search employed by all the algorithms was an implementation of
safeguarded cubic interpolation and was required to produce a point $x_{i+1}$ satisfying
the following standard stability conditions (see Fletcher [2], for example):

$f(x_{i+1}) \le f(x_i) + 10^{-4} s_i^T g(x_i)$; (34)

$s_i^T g(x_{i+1}) \ge 0.9\, s_i^T g(x_i)$. (35)
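
As an illustration, the two acceptance tests can be coded as a small predicate. The sketch below is not the safeguarded cubic line-search used in the experiments; the function name and its arguments are hypothetical, and the default constants `c1 = 1e-4` and `c2 = 0.9` are assumptions that should be adjusted to whatever values are intended in (34) and (35). The arguments `slope_old` and `slope_new` stand for $s_i^T g(x_i)$ and $s_i^T g(x_{i+1})$.

```python
def satisfies_stability_conditions(f_old, f_new, slope_old, slope_new,
                                   c1=1e-4, c2=0.9):
    """Check a sufficient-decrease test and a slope (curvature) test for a step."""
    sufficient_decrease = f_new <= f_old + c1 * slope_old
    slope_condition = slope_new >= c2 * slope_old
    return sufficient_decrease and slope_condition


# Hypothetical one-dimensional example: f(x) = x^2, g(x) = 2x, starting at x = 1.
def f(x):
    return x * x


def g(x):
    return 2.0 * x


x_old = 1.0

# A reasonable step s = -0.5 passes both tests.
s = -0.5
good_step = satisfies_stability_conditions(
    f(x_old), f(x_old + s), s * g(x_old), s * g(x_old + s))

# A very short step s = -0.01 reduces f but fails the slope test.
s = -0.01
short_step = satisfies_stability_conditions(
    f(x_old), f(x_old + s), s * g(x_old), s * g(x_old + s))
```

The second test is what rules out excessively short steps, which would otherwise satisfy the decrease condition trivially.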

The suite of test functions used in the experiments is as described in Ford
and Moghrabi [4], with a small number of modifications to starting-points and
convergence criteria. The suite comprises 107 functions, in all, and includes
many test functions which are well-documented in the literature (for example,
Moré, Garbow and Hillstrom [11], Conn et al. [1] and Toint [12]), such as the
Penalty I and II functions, the Chained Wood function, the Discrete Boundary-
Value function, the Engvall function, the Discrete Integral Equation function
and the Gragg-Levy function. The dimensions of these test problems ranged
from 2 to 80. For each function, four starting-points were used, giving a total of
428 test problems. For convenience, the functions were notionally classified into
those of ‘low’ ($2 \le n \le 15$), ‘medium’ ($16 \le n \le 45$) and ‘high’ ($46 \le n \le 80$)
dimension. In total, there were 29 functions in the ‘low’ set, 43 in the ‘medium’
set and 35 in the ‘high’ set, giving 116, 172 and 140 test problems in the
respective sets.



Table 1. Comparison of four new minimum curvature methods

Problem set   Low             Medium          High            Combined

C_alt         20739 (15148)   32634 (28769)   25085 (23239)   78458 (67156)
  Scores      45              84              71              200

C_recur       21820 (15659)   35036 (30606)   24561 (22926)   81417 (69191)
  Scores      38              73              64              175

D_alt         20898 (15560)   37327 (33663)   29485 (27933)   87710 (77156)
  Scores      31              17              8               56

D_recur       21172 (15299)   36247 (32237)   27571 (26001)   84990 (73537)
  Scores      31              10              6               47



Results from these first experiments are summarized in Table 1. Each of 
the tables which we present is divided into five columns, three of which cor- 
respond to the subsets of functions referred to above, while the last refers to 
the complete set. The main entry for a method in each column gives the total 
number of function / gradient evaluations required by that method to solve 
all the problems in the specified set, followed by the total number of iterations 
(in brackets). A ‘best performance’ for each problem was decided on the basis 
of the lowest number of evaluations, with ties resolved by the number of iter- 
ations. The row labelled ‘Scores’ shows the number of best performances by 
each method for the relevant set. 

It is evident, from Table 1, that methods based on the implicit update
D are not competitive with methods based on C. (Comparison with Table 2
below shows that they are not even competitive, on the problems with highest
dimension, with the original “minimum curvature” method A.) Although this
result may be somewhat surprising (it might have been expected, instead,
that an update employing the ‘latest’ data would be more successful still than
the method using $M = B_i$, let alone the method A), we point out that it is
consistent with the results obtained by Ford and Tharmlikit [10] when using
a similar implicit update. On the basis of the results reported in Table 1, the
methods $D_{\mathrm{alt}}$ and $D_{\mathrm{recur}}$ will not be considered further here.

Table 2. Comparison of new and old minimum curvature methods

Problem set   Low               Medium            High              Combined

BFGS          21269 (16073)     42202 (38478)     33549 (32192)     97020 (86743)
  Ratios      100.0% (100.0%)   100.0% (100.0%)   100.0% (100.0%)   100.0% (100.0%)

A             21053 (15300)     32769 (28898)*    25721 (23908)     79543 (68106)*
  Ratios      99.0% (95.2%)     77.6% (75.1%)*    76.7% (74.3%)     82.0% (78.5%)*
  Scores      30                17                12                59

B_alt         21201 (15557)     34848 (30625)     25083 (23306)     81132 (69488)
  Ratios      99.7% (96.8%)     82.6% (79.6%)     74.8% (72.4%)     83.6% (80.1%)
  Scores      31                55                53                139

B_recur       22015 (15869)     36646 (31840)     24853 (23152)     83514 (70861)
  Ratios      103.5% (98.7%)    86.8% (82.7%)     74.1% (71.9%)     86.1% (81.7%)
  Scores      18                21                16                55

C_alt         20739 (15148)     32634 (28769)     25085 (23239)     78458 (67156)
  Ratios      97.5% (94.2%)     77.3% (74.8%)     74.8% (72.2%)     80.9% (77.4%)
  Scores      35                77                53                165

C_recur       21820 (15659)     35036 (30606)     24561 (22926)     81417 (69191)
  Ratios      102.6% (97.4%)    83.0% (79.5%)     73.2% (71.2%)     83.9% (79.8%)
  Scores      29                56                48                133

A second set of experiments (using the same test functions) was then
conducted, in order to compare the more successful new methods $C_{\mathrm{alt}}$ and $C_{\mathrm{recur}}$
with the existing “minimum curvature” methods A and $B_{\mathrm{alt}}$, and the new
version $B_{\mathrm{recur}}$ of B. These results are reported in Table 2. For comparison,
we have also included the results returned by the standard single-step BFGS
method. In this Table, the entries in each ‘Ratios’ row give the proportions
of evaluations and (in brackets) iterations for that method, expressed as
percentages of the corresponding figures for the BFGS method. Finally, scores
(indicating ‘best performances’) are recorded again for this set of experiments,
but we point out that the results returned by the BFGS method were not
included in assessing these best performances, since our primary purpose is to
compare the “minimum curvature” methods. (The notation * placed against
two of the results for the method A indicates that there was one failure for this
method [for a test problem in the Medium category], where it was unable to
converge to an acceptable minimum within the permitted limit of 5000
iterations. The evaluations and iterations incurred on this failure have not been
included in the relevant totals.)
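
The ‘Ratios’ entries can be reproduced directly from the totals. The short Python sketch below uses the combined figures for BFGS, $B_{\mathrm{alt}}$ and $C_{\mathrm{alt}}$ as listed in Table 2; the dictionary layout and the function name are assumptions introduced for the example.

```python
# Combined totals from Table 2: (function/gradient evaluations, iterations).
totals = {
    "BFGS": (97020, 86743),
    "B_alt": (81132, 69488),
    "C_alt": (78458, 67156),
}


def ratios(method, baseline="BFGS"):
    """Evaluations and iterations as percentages of the baseline figures."""
    evals, iters = totals[method]
    base_evals, base_iters = totals[baseline]
    return (round(100.0 * evals / base_evals, 1),
            round(100.0 * iters / base_iters, 1))


c_alt_ratios = ratios("C_alt")
b_alt_ratios = ratios("B_alt")
```

For these inputs the computed percentages agree with the tabulated ones, 80.9% (77.4%) for $C_{\mathrm{alt}}$ and 83.6% (80.1%) for $B_{\mathrm{alt}}$.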

6 Summary and conclusions 

It has been shown how the technique of “implicit updates” may be embedded
within the “minimum curvature” approach to determining smooth interpolatory
polynomials for use in two-step quasi-Newton methods and, further, how
this embedding can be carried out in a computationally inexpensive manner
(without, for example, requiring any additional $O(n^2)$ quasi-Newton updates).
Numerical experiments have demonstrated that methods based on the implicit
update D are not competitive with other “minimum curvature” methods.
Further experiments have shown that some “minimum curvature” methods are
capable of out-performing the standard BFGS method on higher-dimension
problems by as much as 27 - 29% in terms of both function / gradient
evaluations and iterations. On the basis of total evaluations, total iterations and
scores, the most successful method is the alternating implicit method $C_{\mathrm{alt}}$
introduced in this paper. Its nearest competitors are $B_{\mathrm{alt}}$ and $C_{\mathrm{recur}}$. We also
note that methods based on the recurrence (30) tend to perform a little less
effectively than the corresponding methods employing alternation. The
alternating methods are, in addition, a little cheaper to operate, because they do
not need to compute the recurrence and because they only need to solve the
“minimum curvature” sub-problem once in every second iteration, instead of
on each iteration.
Summary. We study boundary movement identification for a parabolic partial
differential equation describing a dynamic diffusion process, on the basis of internally
recorded data. Formulated as a sideways diffusion equation, the problem is treated
by a spatial continuation technique to extend the solution to a known boundary
condition at the desired boundary position. Recording the positions traversed in the
continuation for each time instant yields the boundary position trajectory and hence
the solution of the identification problem. As the problem is ill-posed, a hyperbolic
approximation approach is used to regularize the computation and recast the
equations into a form amenable to analysis.



1 Introduction 

A common feature of inverse problems is their objective of determining the 
“cause from the effects”. A consequence of this is the ill-posed mathemat- 
ical nature [1] of such formulations. This means that they do not satisfy 
Hadamard’s definition of well-posedness: (i) For all admissible data, a solu- 
tion exists, (ii) For all admissible data, the solution is unique, (iii) The solu- 
tion depends continuously on the data. Practical consequences of ill-posedness 
are strong error growth in the computation and the fact that approximation 
errors as well as measurement noise cause blow up of results. Hence, special 
regularization techniques are necessary for stabilizing the computations. 

Here, estimation of boundary position from real process data is considered 
based on a regularized, slowly divergent space marching [6] method. Possible 
approaches for this have been adopted for the sideways heat equation [2], in- 
cluding sequential and entire time-domain computation by output-norm mini- 
mization as well as direct methods. An entire time-domain direct computation 
method will be demonstrated by application to simulated input data contam- 
inated by random noise. If necessary, nonlinearities through variation (tem- 
perature, time or geometry) of material properties and mixed (Robin type) 
boundary conditions are readily included. The integrated regularization and 
straightforward space marching algorithm makes the method especially suited 
to industrial applications. To our knowledge, the proposed approach has not 
previously been used to tackle dynamical boundary identification problems, 
although the main component in the calculation - the sideways diffusion equa- 
tion - has been extensively investigated. 
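
The error growth that makes regularization necessary can be quantified for a simple model case. The sketch below is an illustration under assumptions that are not taken from this paper: it considers the unregularized sideways equation $u_{xx} = u_t$ with unit diffusivity and data given at $x = 0$. A time-harmonic data component $e^{i\omega t}$ then admits the spatially growing solution $e^{i\omega t} e^{(1+i)\sqrt{\omega/2}\,x}$, so that a perturbation of angular frequency $\omega$ can be amplified by the factor $e^{x\sqrt{\omega/2}}$ when marching a distance $x$. The function name is hypothetical.

```python
import math


def amplification(omega, x):
    """Growth factor exp(x * sqrt(omega / 2)) of a mode exp(i*omega*t) for u_xx = u_t."""
    return math.exp(x * math.sqrt(omega / 2.0))


# Marching over a unit distance: low frequencies are nearly unaffected,
# high frequencies are amplified by several orders of magnitude.
factors = [amplification(omega, 1.0) for omega in (0.0, 2.0, 200.0, 2000.0)]

# Noise of size 1e-3 at omega = 200 grows to roughly this magnitude at x = 1.
noise_at_unit_distance = 1e-3 * amplification(200.0, 1.0)
```

Because the factor is unbounded in $\omega$, any stable scheme has to limit the high-frequency content in some way, which is the role played by the regularization.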
The sideways diffusion equation is a useful formulation in boundary
estimation when the boundary shape follows a certain solution value or conforms
to a specified flux. Then, an iterative scheme can be devised for identifying
the boundary. Solution of the sideways diffusion equation is feasible through
discretization of the time variable, using a stabilizing approximation, and
solution of the resulting system of ordinary differential equations [8, 10]. Possible
stabilizing approximations are central and forward differences [9, 10], Fourier
transform, wavelets or mollification filtering [11]. Our approach follows that
of Weber [22], who approximated the sideways heat equation by a well-posed
hyperbolic partial differential equation. In fact, such a model of heat
conduction (frequently termed the telegraph equation) had been proposed by Morse
and Feshbach [16], who pointed out the nonphysical nature of instantaneous
heat transfer in space permitted by the conventional parabolic heat conduction
model.



2 Hyperbolic regularization 

Consider the hyperbolic system in the static and bounded domain 



0 ^1^(x, t) 
dx^ 


= + (».')€|0,Slx|0,,,[. 


(1) 




II 




(2) 




li(x, 0) = Uq{x) 


, u{x,tf) =Uf{x) , 


(3) 



Here, $\gamma > 0$ is a small regularization parameter and $u_f$ an arbitrary function.
For consistency, it is presumed that $u_m(0) = u_0(0)$ and $u_m(t_f) = u_f(0)$. The
characteristics of (1) are straight lines with slopes $\pm\gamma$ and since the domain
of dependence for the solution at any point is bounded by the characteristics
through that point, influence of the endpoint condition $u(x, t_f) = u_f(x)$
becomes negligible as $\gamma$ becomes small. Since (1, 2, 3) is well-posed and, as will
be seen, a regularization of the conventional sideways diffusion problem, it can
be used as an approximation for analysis of the conventional problem. There is
a well-established theory for this type of partial differential equations,
ensuring existence, uniqueness and stability of the solution for realistic measurement
data $u_m(t)$, $q_m(t)$. In the case of a moving boundary, $(x, t) \in\, ]0, s(t)[\, \times\, ]0, t_f[$,
to be determined from the measurements, a priori conditions on the solution
together with the hyperbolic regularization ensure a meaningful solution.

Conceptually (1, 2, 3) is an admissible regularization of the conventional
sideways diffusion problem, where it is desired to obtain the solution and flux
at $x = S$, if it is required that $\gamma \downarrow 0$ as the measurement error diminishes to
zero. Assuming the solution and flux at the origin in (2) can be approximated
by solution of a direct parabolic diffusion equation in $x \in\, ]-\infty, 0[$, the solution
at $x = S$ takes a particularly simple form. Then, the solution trajectory $f(t)$
at the position $S$ can be solved from the Volterra integral equation

$(K f)(t) = u_m(t), \quad 0 < t < \infty$, (4)



where the integral operator is defined by

$(K f)(t) = \dfrac{S}{2\sqrt{\pi}} \displaystyle\int_0^t \dfrac{f(\tau)}{(t - \tau)^{3/2}} \exp\left[ \dfrac{-S^2}{4(t - \tau)} \right] d\tau$. (5)



The integral equation (4) is inherently ill-posed although exact knowledge of
$u_m(t)$ on any finite time horizon uniquely determines the sought temperature
trajectory $f(t)$ on the same time interval. The following Lemma was proved
by Carasso [5], using auxiliary results in [13].

Lemma 1 Let $t_f > 0$. Then, $\forall p$, $1 \le p \le \infty$, $K$ is a compact, linear operator
in $L^p[0, t_f]$. $Kf = 0$ implies $f = 0$. Thus, $K^{-1}$ exists and is unbounded.

In the literature, various additional regularization methods are reported in 
conjunction with iterative solution of the sideways diffusion equation. Filtering 
has been done, either directly in the frequency domain [18], through higher- 
order finite differencing [20] or mollification [14, 15, 2]. Wavelets and spline 
approximation have also been applied in some cases to ensure well-behaved 
computation [3, 11, 2]. For our purposes, regularization beyond hyperbolic ap- 
proximation is unnecessary and would only require further tradeoffs in solution 
accuracy. 



3 Boundary identification 

3.1 Existence, Uniqueness and Stability 

If the space and time variables of (1, 2, 3) are interchanged, a conventional 
hyperbolic initial-boundary value problem is obtained, to which a solution 
u{x,t), {)< X < S can be obtained using Riemann functions [21, 19, 22]. 
From such a solution, existence, uniqueness and continuity with respect to 
initial data, here the measurements Um{t) and qmif) in (2), can be verified. 
Furthermore, as the solution and distance are connected through the fiux, 
stability of the boundary identification problem follows whenever a nonzero 
fiux and parameter 7 may be presumed. More interesting is the error estimate 
of how well the 7 > 0 approximates the case 7 = 0. Such an estimate was 
obtained by Elden [7], indicating the same log-convex behavior as for stability 
of the sideways heat conduction equation [9, 11]. 

A different approach to investigating existence, uniqueness and stability of 
the boundary identification is to convert the problem to a coefficient estimation 
problem by introducing the spatial variable $y(t) = x/s(t)$. For $\gamma = 0$ the
objective is then to find the solution $(v(y,t),\, 1/s^2(t))$ to

$$\frac{1}{s^2(t)}\,\frac{\partial^2 v(y,t)}{\partial y^2} = \frac{\partial v(y,t)}{\partial t}, \qquad (y,t) \in\, ]0,1[\, \times\, ]0,t_f[, \tag{6}$$

$$v(0,t) = u_m(t), \qquad \frac{1}{s(t)}\,\frac{\partial v(0,t)}{\partial y} = q_m(t), \tag{7}$$

$$v(y,0) = v_0(y), \qquad v(y,t_f) = v_f(y(t_f)). \tag{8}$$

With certain restrictions on the input data $u_m(t)$, $q_m(t)$, this problem can be
solved “in a well-posed manner”, quoting [4], where the problem was treated in
a modern way applied to a multidimensional geometry. For the one-dimensional
situation in (6, 7, 8) the input data must be differentiable and at least one of
the measured functions must be monotone [12]. For our purposes this formulation
is too restrictive, as in particular $u_m(t)$ may be an oscillating function
and all inputs may contain noise. The coefficient estimation setting is useful,
however, in demonstrating the requirements on the data to achieve a well-posed
computation and in illustrating the degree to which our problem can be
considered ill-posed.

An alternative strategy for stabilization at $\gamma = 0$ with respect to input
noise can be formulated in a case where $s(t) \le S$ and an a priori bound
$u(S,t) < M$ as in (9) exists with $u(s(t),t) \ll u(S,t)$. When the solution
and flux at the origin in (2) can be approximated by solution of a direct
parabolic diffusion equation in $x \in\, ]-\infty,0[$, a bound on the measurement
error amounting to $\varepsilon$ results in a log-convex stability bound proportional to
$M^{x/S}\varepsilon^{1-x/S}$ for the solution [9, 11]. For $\gamma > 0$, a similar, slightly less tight
bound has been established [7].

Discretization of the time variable in (1, 2, 3) and solution of the resulting 
system of ordinary differential equations in the space variable has a stabilizing 
effect as such, since this prevents blow-up of high-frequency components in 
the solution [9]. Discretization represents the unbounded time-differentiation 
operator on the left-hand side of (1) by a bounded matrix and although of 
high condition number it limits magnification of measurement noise. The rec- 
ommended step size for time discretization is t/(log(M/e))“^/2 [9, 10]. 



3.2 Preliminaries 

Consider now the situation depicted in the $(x,t)$-plane in Fig. 1 for (1) having
a moving boundary $s(t) > 0$ and a specified solution function on this boundary,
i.e., $u(s(t),t) = u_s(t)$. It is presumed that this function is such that there exists
an intermediate value solution $s(t)$, i.e.,

$$u_m(t) < u_s(t) < u(S,t) \quad \forall\, 0 < t < t_f. \tag{9}$$

Furthermore, for solution uniqueness it is presumed that the desired boundary
trajectory is found when reaching the correct boundary solution function the
first time according to the procedure:

- As continuation proceeds and the solution boundary condition is reached,
the solution in $\mathcal{D}_S \setminus \mathcal{D}_s$ is put equal to $u_s(t)$.

- As continuation proceeds and the solution boundary condition is reached,
the flux $u_x$ in $\mathcal{D}_S \setminus \mathcal{D}_s$ is put equal to zero.

Fig. 1. The relation between domains for the classical sideways diffusion problem
and the boundary identification problem for a finite time horizon

Keeping in mind the regularized formulation of the problem as a damped wave
equation (1), it is clear that the speed of boundary movement $s'(t)$ must not
exceed the wave velocity $1/\gamma$ in absolute value. The rather restrictive
assumption (9) together with the above procedure ensures (by a simple intermediate
value argument) that the condition $u(s(t),t) = u_s(t)$ can be satisfied for almost
all times, yielding a unique $s(t)$. Hence, conditions (i) and (ii) of the
introduction can be considered fulfilled through formulation, noting that (9) is realistic
for the class of industrial boundary identification problems we have in mind
for application of the method proposed in this paper. There remains condition (iii),
which is tackled as follows: accept possible invalidity of (iii) (for $\gamma = 0$) and
devise a method diverging slowly enough to amplify moderate measurement
errors in $u_m(t)$, $q_m(t)$ only to within bounds of resolution. A Lax-Richtmyer analysis
is given below for theoretical discussion of this issue.



4 Computational method 

4.1 Cauchy Problem 

Transform (1, 2, 3) to an abstract Cauchy problem applying the method of
lines by using subscript $x$ for spatial differentiation

Here, the initial and endpoint conditions (3) will be incorporated into the
differential operator in the matrix of (10).



4.2 Solution Procedure 



Adopting a difference approximation (12) to the differential operator,
the sampled measurement data

$$U(0) = [u_m(0)\;\; u_m(\Delta t)\;\; \dots\;\; u_m(N\Delta t)]^T, \qquad U_x(0) = [q_m(0)\;\; q_m(\Delta t)\;\; \dots\;\; q_m(N\Delta t)]^T \tag{13}$$

can be used for integration of (10) to obtain the discrete solution profile

$$U(x) = [u(x,0)\;\; u(x,\Delta t)\;\; \dots\;\; u(x,N\Delta t)]^T, \qquad 0 < x < s(t). \tag{14}$$



Direct numerical integration yields solution trajectories (functions of time in
vector form) at positions $0 < x < s(t)$. When an element of the solution vector
reaches the boundary condition $u(s(t),t) = u_s(t)$, this element is excluded
and the time interval is split into two subintervals for which the computation
is repeated. To obtain the boundary position at the element, record the
integration distance. Depending on the shape of the initial-value vector (13),
a small number of such interval divisions are necessary to satisfy the boundary
condition for most of the nodes. For each subinterval, starting point and
endpoint conditions are not known (except for the first interval starting point,
specified by the initial condition at $t = 0$). Therefore, it is reasonable to
assume that the solution extends smoothly across these points and to introduce
a corresponding linear extrapolation [10] into the difference scheme (12).
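
The marching-and-recording idea can be illustrated by a minimal sketch. It is not the authors' implementation: it assumes the regularized equation has the damped-wave form $u_{xx} = u_t + \gamma^2 u_{tt}$ (equation (1) is not reproduced in this section), it replaces the difference scheme (12) by `numpy.gradient`, and it omits the interval splitting and the linear extrapolation at the interval ends. The function name `march_and_locate` and its arguments are hypothetical.

```python
import numpy as np

def march_and_locate(u_m, q_m, u_s, dt, dx, gamma, n_steps):
    """Explicit Euler space marching from x = 0 (illustrative sketch).

    Returns, for every time node, the integration distance at which the
    marched solution first reaches the boundary value u_s (NaN if never).
    Assumes u_m < u_s as in (9) and the form u_xx = u_t + gamma^2 u_tt.
    """
    U = np.asarray(u_m, dtype=float).copy()   # solution trajectory at current x
    V = np.asarray(q_m, dtype=float).copy()   # flux u_x at current x
    u_s = np.broadcast_to(np.asarray(u_s, dtype=float), U.shape)
    s = np.full(U.shape, np.nan)              # recorded boundary positions
    for k in range(n_steps):
        U_t = np.gradient(U, dt)              # time derivative (stand-in for (12))
        U_tt = np.gradient(U_t, dt)
        U_next = U + dx * V
        V_next = V + dx * (U_t + gamma**2 * U_tt)
        # nodes that cross the boundary value for the first time in this step
        crossed = np.isnan(s) & (U < u_s) & (U_next >= u_s)
        frac = (u_s[crossed] - U[crossed]) / (U_next[crossed] - U[crossed])
        s[crossed] = (k + frac) * dx          # linear interpolation in x
        U, V = U_next, V_next
    return s
```

For a steady linear profile (constant data $u_m = 0$, $q_m = 2$ and boundary value $u_s = 1$) the sketch recovers the distance $0.5$ at every time node, which is a convenient sanity check before noisy data are used.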



4.3 Lax-Richtmyer Analysis

Applying the Lax-Richtmyer theory [17] the discretized solution operator 
of (10), viewed as the mapping from the noisy sensor data to the unknown 
boundary, can be investigated [6]. The system (10), discretized through (12) 
can hence be considered as a space marching scheme 

+ [It o] +^) ■■=C(Ax,At) (x). (15) 
Fig. 2. Amplification factors for the analytical case, two-point central (S3 in [6])
and four-point forward difference approximations ((17) for $\gamma = 0$) and the forward
difference hyperbolic regularization in (17)



Introducing the Fourier image $G(\Delta x,\theta)^P$, in terms of the normalized frequency
$\theta$, of $C(\Delta x,\Delta t)^P$, one has the relation

$$\|C^P\| = |G(\theta)^P|_{\ell_2}, \tag{16}$$

where $\|\cdot\|$ denotes the norm in the time variable, $|\cdot|_{\ell_2}$ the Euclidean norm
and $P$ is the maximum number of spatial steps required to reach the desired
boundary. The matrix $G(\theta)$ is $2 \times 2$ and, e.g., a four-step forward difference
approximation to $\partial/\partial t$ combined with a central difference approximation to $\partial^2/\partial t^2$
in (12) yields



G[ 6 ) = 









Ax 

0 



(17) 



For comparison, the corresponding analytical solution operator for one space 
step Ax is a convolution kernel with the Fourier image The mag- 
nitude of this is g{Ax^ 6 ) = to which \G{Ax^ 6 )\i 2 is a discrete 

approximation. Thus, marching the input data $P$ spatial steps to reach the
desired boundary amplifies the noise by a factor $g(\Delta x,\theta)^P$. Alternative
discretizations (12) are treated extensively by Carasso [6], including the cases
with $\gamma = 0$ and central difference as well as the forward difference approximation
of $\partial/\partial t$ in (17). For the sake of illustration, an example computation with
exaggerated parameter values was performed, results of which are depicted in
Fig. 2 together with the analytic amplification factor g{Ax^ 9 )^ for the param- 
eter values At = 0.5, Ax = 5 • 10~^, P = lO'^ and 7 = 0.5. From Fig. 2 it 
appears that the hyperbolic regularization in (17) performs well with respect 
to error growth. 




Fig. 3. (Left) Diffusion profile and flux measured at the origin (solid) together with 
their theoretical trajectories (dashed), for increasing and oscillating thickness tra- 
jectory. (Right) Corresponding identified (solid) and theoretical (dashed) thickness 
trajectories 



5 Example simulation 

A simulation was carried out in order to test the ability of the outlined
computation method to identify different types of smooth thickness variation.
To generate corresponding simulated measurements at $x = 0$, the direct
parabolic problem ((1) with $\gamma = 0$) was solved with the theoretical
trajectories for $s(t)$ and $u_s(t)$ as boundary conditions. After superposition of
a noise signal (normally distributed, relative amplitude 0.1 %), the full curves
$u_m(t)$ and $q_m(t)$, depicted in Fig. 3, were obtained. Subsequently, the system
(10, 13) was solved with the “measurement signals” as initial conditions
using a simple explicit Euler-method for space marching with Ax = 10“^, 
At = 7.9 • 10~^ and 7 = 0.01. Hence, the applicable Courant-Friedrichs-Lewy 
condition ^AxjAt < 1 is fulfilled. Identification of the boundary trajectory 
from the direct integration as described above, yielded the results on the right 
hand of Fig. 3. 
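
The generation of the noisy "measurement signals" can be sketched as follows. This is only an illustration under stated assumptions: "relative amplitude" is interpreted here as the noise standard deviation divided by the maximum absolute value of the clean signal, and the function name `add_relative_noise` is hypothetical.

```python
import numpy as np

def add_relative_noise(signal, rel_amplitude=1e-3, seed=0):
    """Superimpose normally distributed noise on a clean signal.

    Assumption: the noise standard deviation is rel_amplitude times the
    maximum absolute value of the signal (0.1 % corresponds to 1e-3).
    """
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    scale = rel_amplitude * np.max(np.abs(signal))
    return signal + scale * rng.standard_normal(signal.shape)
```

Applying the function to the clean trajectories of the direct parabolic problem, once for the solution and once for the flux, would give stand-ins for $u_m(t)$ and $q_m(t)$.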

The results confirm the feasibility of the method for situations with
noisy measurement data. As seen in the figures, the solutions are stable and
consistent with the theoretical (dashed) curves. It should be noted that there
is a natural time lag involved before the variation induced by boundary
movement reaches the sensor location at $x = 0$. In the direct integration method, the
entire time domain is treated simultaneously without account of this causality
aspect. Causality of the problem would require measurement data for the
interval (1, 1.1] for the solution of the final time interval (0.9, 1.0] to be entirely
trustworthy, despite the extrapolation made for the initial and final solution
nodes (cf. the discussion in Eldén [8]). Thus, if heads and tails of the function
trajectories are disregarded, the results are encouraging.



6 Conclusions 

A simple one-dimensional, dynamic model for boundary identification was for- 
mulated on basis of direct integration of the sideways diffusion equation. Simu- 
lated, noisy input signals were used to illustrate stability against measurement 
errors and model response to different types of boundary variation. The results 
indicate model feasibility by both of these criteria. 
Summary. In the present paper, we analyze computational properties of the functional
type a posteriori error estimates that have been derived for elliptic type
boundary-value problems by duality theory in the calculus of variations. We are concerned
with the ability of this type of a posteriori estimates to provide accurate
upper bounds of global errors and properly indicate the distribution of local ones.
These questions were analyzed on a series of boundary-value problems for linear
elliptic operators of 2nd and 4th order. The theoretical results are confirmed by
numerical tests in which the duality error majorant for the classical diffusion problem
is compared with the standard error indicator used in the MATLAB PDE Toolbox.
The numerical tests show that the meshes generated on the basis of the
majorant are very close to those that would be computed if on each step of the mesh
refinement process we knew the exact error distribution. At the same time, meshes
generated by the MATLAB code may differ considerably from them.



1 Introduction 

For several decades the attention of a number of authors has been focused 
on questions of reliability and efficiency of calculations in computational en- 
gineering. These questions are closely related to the progress in the theory of 
a posteriori error control. On the one hand, it is necessary to have guaran- 
teed upper bounds on the errors computed in a suitable norm. On the other 
hand, it is very desirable to also have qualitative indication of their local be- 
havior. These efforts are aimed to decrease computational costs while ensuring 
accurate and reliable modelling of physical phenomena. 

Nowadays, in the framework of Finite Element Methods several approaches
to error control are used widely. The first of these was formulated at the end of
the 1970s in the works of Babuška and Rheinboldt (see [2, 3]). Further investigations
of this subject were pursued by a number of authors and the amount of the
corresponding literature is very large. The most complete description of the
methods and the associated literature are given, for instance, in [1, 4, 11].

The main idea of the method which we investigate in this paper was introduced
in [8, 9]. An important property of this method (which makes it different
from other approaches) is that the Duality Error Majorants (DEM) allow us to
estimate the accuracy of any conforming approximate solution independently
of the type of approximation (i.e., majorants are also suitable for methods other
than FEM). The main advantage of this technique is that it yields guaranteed
upper bounds of the energy norm of the error. In principle, bounds can
be computed as accurately as required (subject to the implied computational
effort). This fact has been mathematically justified in [6] and [10]. As has been
shown in [10], the method also provides local indication of the error. Both
results were confirmed by a considerable amount of numerical testing.

In view of the above-mentioned properties, it is natural to expect that
a combination of the DEM error estimator with standard packages will lead
to highly effective numerical procedures. This expectation is confirmed by the
tests performed. In this paper, we use the MATLAB PDE Toolbox for mesh
generation and adaptive refinements. In addition, we compare the majorant with
the standard error indicator of the toolbox.

In the very last example, we also present the results of numerical experi- 
ments for the biharmonic problem. 



2 Duality Error Majorant for a 2nd Order Model 
Problem 



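In this section, we take the classical diffusion problem with a Dirichlet type
boundary condition as a basis of our investigations (we state it in the
variational form and call it the primal problem).

Problem $\mathcal{P}$. Find $u \in V$ such that

$$J(u) = \inf_{v \in V} J(v), \qquad J(v) := \int_\Omega \Big( \tfrac{1}{2} A\nabla v \cdot \nabla v - f v \Big)\, dx,$$

where $V := \{ v \mid v = w + u_0,\ w \in V_0 \}$ and $V_0 := H_0^1(\Omega)$.

It is assumed that $\Omega$ is a bounded connected domain in $\mathbb{R}^n$ with a Lipschitz
continuous boundary $\partial\Omega$, $f \in L^2(\Omega)$, $A \in L^\infty(\Omega, \mathbb{R}^{n \times n})$, and there exist positive
constants $\alpha_1$, $\alpha_2$ such that

$$\alpha_1 |q|^2 \le A q \cdot q \le \alpha_2 |q|^2 \quad \forall q \in \mathbb{R}^n, \qquad |q|^2 := q \cdot q.$$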

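From the theory of the calculus of variations, it is well known that Problem
$\mathcal{P}$ has the dual counterpart Problem $\mathcal{P}^*$ (see, e.g., [5]).

Problem $\mathcal{P}^*$. Find $p^* \in Q_f^*$ such that

$$I^*(p^*) = \sup_{q^* \in Q_f^*} I^*(q^*), \qquad I^*(q^*) := \int_\Omega \Big( \nabla u_0 \cdot q^* - \tfrac{1}{2} A^{-1} q^* \cdot q^* - f u_0 \Big)\, dx,$$

where $Q_f^* := \Big\{ q^* \in L^2(\Omega, \mathbb{R}^n) \ \Big|\ \int_\Omega q^* \cdot \nabla w \, dx = \int_\Omega f w \, dx \quad \forall w \in V_0 \Big\}$.

Solutions of Problem $\mathcal{P}$ and Problem $\mathcal{P}^*$ satisfy the relations

$$J(u) = \inf_{v \in V} J(v) = \sup_{q^* \in Q_f^*} I^*(q^*) = I^*(p^*), \tag{1}$$

$$p^* = A\nabla u, \qquad -\nabla\cdot p^* = f \quad \text{a.e. in } \Omega. \tag{2}$$

For any conforming approximate solution $v \in V$, using the relation (1), we
immediately arrive at the a posteriori error estimate for the energy norm of the
error $e = v - u$:

$$||| e |||^2 := \int_\Omega A\nabla e \cdot \nabla e \, dx \le \int_\Omega (A\nabla v - q^*) \cdot (\nabla v - A^{-1} q^*)\, dx,$$

where $q^*$ is any element of $Q_f^*$.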

This estimate is mainly of theoretical value. In many cases, it is hard in practice
to construct conforming approximations that belong to $Q_f^*$. But, as was
originally shown in [8, 9] (see also [10]), this difficulty can be overcome by
extending the set of admissible functions for the dual variable. Hence we arrive
at the estimate:

$$||| e |||^2 \le M(v,\beta,y^*) := M_D(v,\beta,y^*) + M_R(\beta,y^*), \tag{3}$$

$$M_D(v,\beta,y^*) := (1+\beta) \int_\Omega (A\nabla v - y^*) \cdot (\nabla v - A^{-1} y^*)\, dx,$$

$$M_R(\beta,y^*) := \Big(1 + \frac{1}{\beta}\Big) \frac{C_\Omega^2}{\alpha_1} \int_\Omega (\nabla\cdot y^* + f)^2\, dx,$$

where $y^*$ is an element of $Q_\nabla^* := \{ q^* \in L^2(\Omega, \mathbb{R}^n) \mid \nabla\cdot q^* \in L^2(\Omega) \}$, $\beta$ is a
positive number, and $C_\Omega$ is a constant in the Poincaré-Friedrichs inequality
$\| w \| \le C_\Omega \| \nabla w \|$ for all $w \in V_0$.
The functional $M$ is called the Duality Error Majorant. It is defined on the
pair of free variables $(\beta, y^*)$. For any $v$, any values of $\beta$ and $y^*$ from $\mathbb{R}_+ \times Q_\nabla^*$
provide a guaranteed upper bound on the error. However, we should minimize
$M(v,\beta,y^*)$ with respect to $\beta$ and $y^*$ to compute a sharp estimate.
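
A small one-dimensional worked example may help to see how the majorant behaves. The sketch below is illustrative only and rests on assumptions not taken from the paper: the model problem $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$, so that $A = 1$, $\alpha_1 = 1$ and the Friedrichs constant is $C_\Omega = 1/\pi$; the trial functions `v` and `y` and the helper names `integrate01` and `majorant` are hypothetical. For fixed $y^*$ the optimal parameter is $\beta = \sqrt{R/D}$, where $D$ and $R$ are the two integrals without the $\beta$-dependent factors.

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

def integrate01(p):
    """Exact integral of a polynomial over the interval (0, 1)."""
    P = p.integ()
    return P(1.0) - P(0.0)

def majorant(v, y, f, beta, c_omega=1.0 / np.pi, alpha1=1.0):
    """M(v, beta, y) for -u'' = f on (0, 1) with A = 1 (illustrative 1D case)."""
    m_d = (1.0 + beta) * integrate01((v.deriv() - y) ** 2)
    m_r = (1.0 + 1.0 / beta) * c_omega**2 / alpha1 * integrate01((y.deriv() + f) ** 2)
    return m_d + m_r

x = Poly([0.0, 1.0])
f = Poly([1.0])
u = 0.5 * x * (1 - x)          # exact solution of -u'' = 1, u(0) = u(1) = 0
v = 0.45 * x * (1 - x)         # hypothetical conforming approximation
y = Poly([0.48, -0.95])        # hypothetical approximation of the flux u'

err2 = integrate01((v.deriv() - u.deriv()) ** 2)          # squared energy norm of e
d = integrate01((v.deriv() - y) ** 2)
r = (1.0 / np.pi) ** 2 * integrate01((y.deriv() + f) ** 2)
beta_opt = np.sqrt(r / d)                                 # minimizer for fixed y
bound = majorant(v, y, f, beta_opt)
```

With these choices `bound` stays above `err2`, and replacing `y` by the exact flux `u.deriv()` makes the second term vanish, so the bound approaches `err2` as `beta` tends to zero.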

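Let us consider a sequence $\{Q^*_{\nabla k}\}_{k=1}^{\infty}$ of finite dimensional subspaces of $Q^*_\nabla$
that possess the limit density property, i.e., for any $\delta > 0$ and any $q^* \in Q^*_\nabla$
there exists a positive integer $k_\delta$ such that

$$\inf_{y^* \in Q^*_{\nabla k}} \| q^* - y^* \|_{Q^*_\nabla} < \delta \quad \forall k > k_\delta, \qquad \| q^* \|^2_{Q^*_\nabla} := \| q^* \|^2 + \| \nabla\cdot q^* \|^2$$

(we denote by $\|\cdot\|$ two different $L^2$-norms).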



Let us introduce the functional

$$M_k(v) := M(v,\beta_k,y_k^*) = \inf_{\beta > 0,\ y^* \in Q^*_{\nabla k}} M(v,\beta,y^*),$$

whose value can be computed by solving an auxiliary finite-dimensional problem
on the subspace $Q^*_{\nabla k}$.

Theorem 1. If the sequence $\{Q^*_{\nabla k}\}_{k=1}^{\infty}$ possesses the limit density property in
$Q^*_\nabla$, then

$$M_k(v) \to ||| e |||^2 \quad \text{as } k \to \infty.$$

The proof of this theorem is based on the relations (2) and the above-mentioned
property.

It is also important to emphasize the following result concerning the convergence
of the optimal sequence $y_k^*$ to $p^*$ in $Q^*_\nabla$ (see [6]).

Theorem 2. If the sequence of pairs $(\beta_k, y_k^*)$ that minimizes $M(v,\beta,y^*)$ on
$\mathbb{R}_+ \times Q^*_{\nabla k}$ is such that $\beta_k \to 0$, then

$$y_k^* \to p^* \ \text{in } Q^*_\nabla, \qquad M_D(v,\beta_k,y_k^*) \to ||| e |||^2,$$

and

$$M_R(\beta_k,y_k^*) \to 0 \quad \text{as } k \to \infty.$$

The theorems state properties of the duality error majorant as a global estimator.

Let us denote by $\varepsilon(x)$ and $\mu_k(x)$ the integrands of the error and of the
majorant $M(v,\beta_k,y_k^*)$, respectively:

$$\varepsilon(x) := A\nabla e(x) \cdot \nabla e(x),$$

$$\mu_k(x) := (1+\beta_k)\,(A\nabla v(x) - y_k^*(x)) \cdot (\nabla v(x) - A^{-1} y_k^*(x)) + \Big(1 + \frac{1}{\beta_k}\Big) \frac{C_\Omega^2}{\alpha_1}\, (\nabla\cdot y_k^*(x) + f(x))^2.$$

For any positive $\sigma$, we define the set

$$\Omega_\sigma^k := \{ x \in \Omega : |\mu_k(x) - \varepsilon(x)| > \sigma \}.$$

Theorem 3. Under the assumptions of Theorem 2

$$\operatorname{meas}(\Omega_\sigma^k) \to 0 \quad \text{for any given } \sigma > 0 \text{ as } k \to \infty.$$

The proof of Theorem 3 can be found in [10].

Theorem 3 states that the Duality Error Majorant also provides effective
indication of the distribution of local errors, and, therefore, this proposition
has great importance for any mesh refinement process.
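3 Duality Error Majorant for a 4th Order Model
Problem

In this section, we consider the primal problem related to the biharmonic
operator and the corresponding dual problem.

Problem $\mathcal{P}$. Find $u \in W_0$ such that

$$J(u) = \inf_{v \in W_0} J(v), \qquad J(v) := \int_\Omega \Big( \tfrac{1}{2} B\nabla\nabla v : \nabla\nabla v - f v \Big)\, dx,$$

where $W_0 := H_0^2(\Omega)$. As in Section 2, we assume that $f \in L^2(\Omega)$, the tensor
$B = \{b_{ijsl}\}$ possesses the symmetry property $b_{(ij)(sl)} = b_{(sl)(ij)}$ for $i,j,s,l = 1,\dots,n$,
and there exist positive constants $\alpha_1$, $\alpha_2$ such that

$$\alpha_1 |\kappa|^2 \le B\kappa : \kappa \le \alpha_2 |\kappa|^2 \quad \forall \kappa \in \mathbb{M}_s^{n \times n}, \qquad |\kappa|^2 := \kappa : \kappa. \tag{4}$$

Problem $\mathcal{P}^*$. Find $m^* \in N_f^*$ such that

$$I^*(m^*) = \sup_{n^* \in N_f^*} I^*(n^*), \qquad I^*(n^*) := \int_\Omega \Big( -\tfrac{1}{2} B^{-1} n^* : n^* \Big)\, dx,$$

where $N_f^* := \Big\{ n^* \in L^2(\Omega, \mathbb{M}_s^{n \times n}) \ \Big|\ \int_\Omega n^* : \nabla\nabla w \, dx = \int_\Omega f w \, dx \quad \forall w \in W_0 \Big\}$.

The relationship between Problem $\mathcal{P}$ and Problem $\mathcal{P}^*$ here is similar to that
presented by (1) and (2):

$$J(u) = \inf_{v \in W_0} J(v) = \sup_{n^* \in N_f^*} I^*(n^*) = I^*(m^*),$$

$$m^* = B\nabla\nabla u, \qquad \nabla\cdot\nabla\cdot m^* = f \quad \text{a.e. in } \Omega. \tag{5}$$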

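In this case, the respective majorant was derived in [7] and has the form

$$||| e |||^2 := \int_\Omega B\nabla\nabla e : \nabla\nabla e \, dx \le M(v,\beta,\kappa^*) := M_D(v,\beta,\kappa^*) + M_R(\beta,\kappa^*), \tag{6}$$

$$M_D(v,\beta,\kappa^*) := (1+\beta) \int_\Omega (B\nabla\nabla v - \kappa^*) : (\nabla\nabla v - B^{-1}\kappa^*)\, dx,$$

$$M_R(\beta,\kappa^*) := \Big(1 + \frac{1}{\beta}\Big) \frac{C_{W\Omega}^2}{\alpha_1} \int_\Omega (\nabla\cdot\nabla\cdot\kappa^* - f)^2\, dx,$$

where $\kappa^* \in N^*_{\nabla\nabla} := \{ n^* \in L^2(\Omega, \mathbb{M}_s^{n \times n}) \mid \nabla\cdot\nabla\cdot n^* \in L^2(\Omega) \}$, $\beta \in \mathbb{R}_+$, $C_{W\Omega}$ is
a constant in the inequality

$$\| w \| \le C_{W\Omega} \| \nabla\nabla w \| \quad \forall w \in W_0,$$

and $\alpha_1$ is as in (4).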

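In general terms, the techniques used for deriving estimates (3) and (6)
are rather close. As in Sect. 2, we consider a sequence $\{N^*_{\nabla\nabla k}\}_{k=1}^{\infty}$ of finite
dimensional subspaces of $N^*_{\nabla\nabla}$ and introduce the corresponding functionals

$$M_k(v) := \inf_{\beta > 0,\ \kappa^* \in N^*_{\nabla\nabla k}} M(v,\beta,\kappa^*).$$

Let us also formulate analogues of Theorems 1-3.

Theorem 4. If $\{N^*_{\nabla\nabla k}\}_{k=1}^{\infty}$ possesses the limit density property in $N^*_{\nabla\nabla}$, then

$$M_k(v) \to ||| e |||^2 \quad \text{as } k \to \infty.$$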

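Proof. By the limit density property, for the exact solution $m^*$ of the dual
problem and any given $\delta > 0$ we can find $k_\delta$ such that, for $k > k_\delta$, there exists
an element $m_k^* \in N^*_{\nabla\nabla k}$ satisfying the inequality

$$\| e_k^* \|_{N^*_{\nabla\nabla}} < \delta, \qquad \| e_k^* \|^2_{N^*_{\nabla\nabla}} := \| e_k^* \|^2 + \| \nabla\cdot\nabla\cdot e_k^* \|^2,$$

where $e_k^* := m_k^* - m^*$. Let $k > k_\delta$; then

$$M_k(v) = \inf_{\beta > 0,\ \kappa^* \in N^*_{\nabla\nabla k}} M(v,\beta,\kappa^*) \le M(v,\delta,m_k^*) = M_D(v,\delta,m_k^*) + M_R(\delta,m_k^*).$$

Given the relations (5), we consider parts of the majorant $M(v,\delta,m_k^*)$:

$$\int_\Omega (B\nabla\nabla v - m_k^*) : (\nabla\nabla v - B^{-1} m_k^*)\, dx = \int_\Omega (B\nabla\nabla e - e_k^*) : (\nabla\nabla e - B^{-1} e_k^*)\, dx$$

$$= ||| e |||^2 - 2 \int_\Omega \nabla\nabla e : e_k^* \, dx + \int_\Omega B^{-1} e_k^* : e_k^* \, dx \le ||| e |||^2 + \frac{2\delta\, ||| e |||}{\sqrt{\alpha_1}} + \frac{\delta^2}{\alpha_1};$$

$$\int_\Omega (\nabla\cdot\nabla\cdot m_k^* - f)^2\, dx = \int_\Omega (\nabla\cdot\nabla\cdot e_k^*)^2\, dx \le \delta^2.$$

Taking into account the multipliers $(1+\delta)$ and $(1+1/\delta)\,C_{W\Omega}^2/\alpha_1$, we arrive
at the estimate $||| e |||^2 \le M_k(v) \le ||| e |||^2 + \delta C$. Therefore,

$$M_k(v) \to ||| e |||^2 \quad \text{as } k \to \infty. \qquad \square$$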
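Theorem 5. If the sequence of pairs $(\beta_k, \kappa_k^*)$ that minimizes $M(v,\beta,\kappa^*)$ on
$\mathbb{R}_+ \times N^*_{\nabla\nabla k}$ is such that $\beta_k \to 0$, then

$$\kappa_k^* \to m^* \ \text{in } N^*_{\nabla\nabla}, \qquad M_D(v,\beta_k,\kappa_k^*) \to ||| e |||^2, \qquad M_R(\beta_k,\kappa_k^*) \to 0,$$

and

$$\operatorname{meas}(\Omega_\sigma^k) \to 0 \quad \text{for any given } \sigma > 0 \text{ as } k \to \infty.$$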



4 Numerical Results 

In this section, we justify the method of Duality Error Majorants as an efficient
tool for numerical simulations. During the last few years we have performed
various types of tests (see [6, 10] and other papers cited therein). In these 
tests, it was observed that the method is accurate and robust. The next step 
in our analysis, which is considered in the present paper, is to investigate the 
behavior of the DEM in the process of adaptive mesh refinement. Certainly, in 
a short note it is impossible to describe all of the tests that we have performed. 
Therefore, we select only one interesting and representative example. 

The main purpose of our investigations is to compare the efficiency of 
different error indicators in the process of mesh adaptation. It is important to 
emphasize that all refinements are based on the same principle (the marking 
strategy is quite standard: we flag elements if the corresponding local error is 
greater than half of the maximum local error). Therefore, any difference in the 
final results is due only to the differences in the efficiencies of the approaches. 
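
The marking rule quoted above (flag an element when its local error exceeds half of the maximum local error) is simple enough to state in a few lines. The sketch below is a minimal illustration; the function name `mark_elements` and the array-based representation of local errors are assumptions, not part of the MATLAB PDE Toolbox interface.

```python
import numpy as np

def mark_elements(local_errors, fraction=0.5):
    """Return indices of elements whose local error exceeds
    `fraction` times the maximum local error (illustrative sketch)."""
    eta = np.asarray(local_errors, dtype=float)
    return np.flatnonzero(eta > fraction * eta.max())
```

Because the same rule is applied to all three indicators, any difference in the refined meshes can be attributed to the indicators themselves rather than to the marking step.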

We have used three error indicators. The first indicator is computed by
comparing an approximate solution $v$ with the exact solution and provides
an objective judgement of the quality of mesh adaptations. The value of this
reference error indicator on an element $T$ is denoted by $\eta_T^E$:

$$\eta_T^E := \int_T A(\nabla u - \nabla v) \cdot (\nabla u - \nabla v)\, dx.$$

Two further indicators to be compared with $\eta_T^E$ are defined as follows:
a local indicator based on the Duality Error Majorant that arises from (3):

$$\eta_T^D := (1+\beta) \int_T (A\nabla v - y_k^*) \cdot (\nabla v - A^{-1} y_k^*)\, dx + \Big(1 + \frac{1}{\beta}\Big) \frac{C_\Omega^2}{\alpha_1} \int_T (\nabla\cdot y_k^* + f)^2\, dx,$$

and the standard local indicator of the MATLAB PDE Toolbox:

$$\eta_T^S := C_1 \| h_T f \|_T + C_2 \Big\{ \tfrac{1}{2} \sum_{E \in \partial T \setminus \partial\Omega} h_E^2\, [n_E \cdot (A\nabla u_h)]^2 \Big\}^{1/2}.$$
In general, the numerical tests performed can be classified into two groups. 
For those in the first group, it was observed that the standard error indicator 
provides adaptively refined meshes of suitable quality. For this group, the DEM 
is also preferable but its advantage is not very considerable. However, for the 
second group of examples, the distinction is rather more obvious. Below, we 
present such a case. 

Example. Let us consider the classical problem

$$-\nabla\cdot(A\nabla u) = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega,$$

where the values of $A$ and $f$ are given in Table 1 (the subdomains of $\Omega$ are
depicted in Fig. 1). From Fig. 2, we conclude that the local errors computed
by the DEM reproduce the actual distribution on the initial mesh with high
accuracy. We also observe a disadvantage of the standard approach: overestimation
of local errors in subdomains where $f \ne 0$. From this point of view,
it is natural to predict that the DEM will lead to a more effective adaptation
than the one obtained by the standard indicator.

Fig. 1. Domain $\Omega$

Table 1. The matrix $A$ and the right-hand side $f$

Subdomain    1               2              3
A            [10 0; 0 1]     [1 0; 0 1]     [10 0; 0 1]
f            1               0              1

Fig. 2. Indication on the initial mesh: (E) exact error distribution; (D) error
indication by the DEM; (S) error indication by the standard indicator


The results of several mesh refinements are collected in Table 2; we denote
by $N$ the corresponding number of degrees of freedom for the meshes
generated. The quantity % presents the percentage relative error of the
approximate solutions. Let us compare similar steps of adaptation computed by
the reference technique (the 1st block of columns) and the DEM (the 2nd block).
Final meshes obtained on the basis of the indicators are also depicted in Fig. 3.
For every step, the DEM gives an accuracy of approximation that is very close
to the optimal one. It is worth pointing out that the effectivity index $I_{\mathrm{eff}}$
of the computed upper bounds is very close to 1 in each step of the mesh
refinement. Eventually, the mesh (D15) almost coincides with the optimal mesh
(E13) (see Fig. 3). At the same time, the standard approach leads to essentially
worse results (see the 3rd block of columns). The difference is clearly observed
for meshes (D16) and (S11). The same accuracy of computed solutions is provided
by the DEM-based technology of adaptation with approximately half
the number of degrees of freedom.

Table 2. Mesh adaptation performed by the considered indicators

(E)                      (D)                                (S)
Iter.   N      %         Iter.   N      %       I_eff       Iter.   N       %
0       225    19.02     0       225    19.02   1.09        0       225     19.02
5       862    6.99      7       876    7.12    1.07        4       1022    7.79
7       1422   5.42      9       1428   5.50    1.07        5       1929    5.70
11      3268   3.45      13      3376   3.50    1.06        7       3989    4.01
(13)    5552   2.70      (15)    5698   2.74    1.06        9       7417    3.01
                         (16)    7142   2.38    1.06        (11)    13560   2.37
In the very last example, we show that the DEM that stems from (6) also
leads to error estimates of high quality.

Example. Let us consider the biharmonic problem with homogeneous boundary
conditions:

$$\Delta^2 u = f \ \text{ in } \Omega, \qquad u = \partial_n u = 0 \ \text{ on } \partial\Omega,$$

where $\Omega$ is the unit square. In this example, we aim to show that the DEM
provides effective error control not only in the framework of Finite Element
Methods but more widely. For this purpose, we choose the exact solution of
the following form: u = (I){x 2 ), where (j){x) = (1 — x)"^ x"^. An approxi- 

mate solution is taken in the form v — u cw^ where w = '0(xi, ki) ' 0 (x 2 , k 2 ) 
and 'ip{x,k) = We select such a value of the constant c that the 

accuracy of $v$ is about 5%. An approximation of the dual variable $\kappa^*$ is also
constructed as a combination of global basis functions (the total number of
which is denoted by $N_b$). The effectivity of the corresponding error bounds for
the various cases is presented in Table 3. From these results, we conclude that
a combination of 196 basis functions is quite enough to provide high-quality
estimates for the various values of the parameters $k_1$ and $k_2$.

Table 3. Efficiency of the DEM for the biharmonic problem

k_1                    0.1    0.2    0.5    0.5
k_2                    0.1    0.3    0.5    1.0
N_b = 144   I_eff      1.42   1.49   1.52   1.53
N_b = 196   I_eff      1.11   1.13   1.13   1.14



5 Conclusions 

We justified theoretically and numerically that the method of Duality Error 
Majorants (a) provides sharp upper bounds on the global energy norm of 
the error, and (b) it reproduces the local behavior of the error with high 
accuracy. For this reason, mesh adaptations based on the DEM are very close 
to those that would be obtained on the basis of the exact knowledge on the 
error distribution. 
Summary. The remapping algorithm is an essential part of the ALE (Arbitrary
Lagrangian-Eulerian) method. In this talk we present such an algorithm based on
linear function reconstruction, approximate integration and mass redistribution.



1 Introduction 

Conservative remapping is an essential part of the ALE (Arbitrary Lagrangian- 
Eulerian) method for fluid dynamics computations. This method tries to use 
advantages of both the Lagrangian and Eulerian approaches. 

At first, several time steps of the pure Lagrangian computation are used. 
As the grid moves together with the fluid, it may happen that the grid be- 
comes distorted or tangled due to shear flow. Now comes the Eulerian part 
of the algorithm. We prepare a new rezoned grid and recompute (remap) the 
quantities from the distorted grid to the rezoned one. 

The remapping step must satisfy several conditions. It must be
efficient to be usable in real computations. The total sum of each conservative
quantity must be preserved, i.e., the algorithm must be conservative. It must not
create new local extrema, i.e., it must be local-bound preserving. It must
be stable and applicable to general unstructured meshes in 2D and 3D. In this
article we introduce a 3D algorithm which satisfies these conditions. A similar
procedure in 2D is described in [1].



2 Algorithm Description 

2.1 Problem statement 

Suppose we have two grids: Lagrangian $C = \{c\}$ and rezoned $\tilde C = \{\tilde c\}$. The
grids have the same topology. The rezoned grid is created from the original
one just by small movement of the grid nodes. There exists some underlying
function $g(\mathbf{r})$, $\mathbf{r} = (x, y, z)$, in the Lagrangian cells (for example $g = \rho$, $g = \rho u$,
$g = \rho v$, $g = \rho w$, $g = \rho(\varepsilon + |\mathbf{U}|^2/2)$, where $\rho$ is mass density, $\mathbf{U} = (u,v,w)$
is the velocity vector and $\varepsilon$ is the internal energy). We do not know the
function itself; we know just the mean values in the grid cells and their masses
and volumes

$$\bar g_c = \frac{\int_c g(\mathbf{r})\, dV}{V(c)} = \frac{m(c)}{V(c)}, \qquad m(c) = \int_c g(\mathbf{r})\, dV, \qquad V(c) = \int_c dV. \tag{1}$$



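Total mass (momentum, energy) in the computational domain $\Omega$ can be
computed as

$$M = \sum_{\forall c} \int_c g(\mathbf{r})\, dV = \sum_{\forall c} m(c). \tag{2}$$

We want to compute new masses $m^*(\tilde c)$ and corresponding mean values in
the rezoned cells

$$\bar g^*_{\tilde c} = \frac{m^*(\tilde c)}{V(\tilde c)} \tag{3}$$

and we want them to be as close to the exact values as possible ($m^*(\tilde c) \approx
m(\tilde c) = \int_{\tilde c} g(\mathbf{r})\, dV$). We also want not to create new local extrema,

$$\bar g_c^{\max} \ge \bar g^*_{\tilde c} \ge \bar g_c^{\min}, \qquad \bar g_c^{\max} = \max_{c_n \in C(c)} \bar g_{c_n}, \qquad \bar g_c^{\min} = \min_{c_n \in C(c)} \bar g_{c_n}, \tag{4}$$

where $C(c) \subset C$ is the neighborhood of cell $c$, and to be conservative (total mass
must be the same)

$$\sum_{\forall \tilde c} m^*(\tilde c) = M.$$

If the underlying function is a linear function, we want our method to be exact:
$m^*(\tilde c) = m(\tilde c) = \int_{\tilde c} g(\mathbf{r})\, dV$ for $g(\mathbf{r}) = a + bx + cy + dz$.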



2.2 Remapping Algorithm 

We design our algorithm in three stages. In the first stage, we make a piecewise 
linear reconstruction of the underlying function on the original mesh. This can 
be done using different methods, with or without limiters. In the second stage, 
we integrate this reconstructed function to obtain means on the new grid. The 
most natural approach would be exact integration, but it needs computation of 
the intersections of the Lagrangian grid with the rezoned one. This intersection 
is very time consuming in 2D and almost unfeasible in 3D, so we use numerical 
quadrature - swept integration. It does not require finding these intersections 
so it is much faster. The problem is that it is an approximate method and 
it may happen that the local extrema are violated, so we need also the third 
stage - repair - which ensures us this local-bound preservation. 
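
The first two stages can be illustrated in one dimension, where the swept region of a node is simply the segment between its old and new position. The sketch below is a simplified 1D analogue, not the 3D algorithm of the paper, and makes several assumptions: boundary nodes are fixed, each interior node moves by less than the width of its adjacent cells, unlimited slopes are taken from `numpy.gradient`, and the repair stage is omitted. The function name `remap_1d` is hypothetical.

```python
import numpy as np

def remap_1d(x_old, x_new, g_mean):
    """Swept-region remap of cell means in 1D (illustrative sketch).

    x_old, x_new : node coordinates of the Lagrangian and rezoned grids
    g_mean       : cell mean values on the Lagrangian grid
    Returns the new cell masses and new cell mean values.
    """
    x_old = np.asarray(x_old, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    g_mean = np.asarray(g_mean, dtype=float)
    xc = 0.5 * (x_old[:-1] + x_old[1:])        # old cell centers
    mass = g_mean * np.diff(x_old)             # old cell masses
    slope = np.gradient(g_mean, xc)            # unlimited slopes (assumption)

    flux = np.zeros(len(x_old))                # boundary nodes stay fixed
    for j in range(1, len(x_old) - 1):
        a, b = x_old[j], x_new[j]
        c = j if b >= a else j - 1             # old cell containing the swept segment
        mid = 0.5 * (a + b)
        # signed integral of the linear reconstruction over [a, b]
        flux[j] = (b - a) * (g_mean[c] + slope[c] * (mid - xc[c]))

    new_mass = mass - flux[:-1] + flux[1:]     # flux form, hence conservative
    return new_mass, new_mass / np.diff(x_new)
```

Since every swept integral enters two neighboring cells with opposite signs, the total mass is unchanged, and for linear data the reconstruction is exact, so the new means coincide with the exact means on the rezoned grid.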
2.3 Stage 1 - Piecewise Linear Reconstruction

We want to reconstruct the underlying function in the form

$$p_c(\mathbf{r}) = p_c(x, y, z) = \bar g_c + s_c^x (x - x_c) + s_c^y (y - y_c) + s_c^z (z - z_c), \tag{5}$$

where

$$x_c = \frac{\int_c x\, dV}{V(c)}, \qquad y_c = \frac{\int_c y\, dV}{V(c)}, \qquad z_c = \frac{\int_c z\, dV}{V(c)} \tag{6}$$

are coordinates of the cell center and $V(c)$ is the volume of the cell defined in (1).

For computation of slopes we use the limited form

$$s_c^x = \Phi_c\, s^x_{\mathrm{unlim}}, \qquad s_c^y = \Phi_c\, s^y_{\mathrm{unlim}}, \qquad s_c^z = \Phi_c\, s^z_{\mathrm{unlim}}, \tag{7}$$

where $s^x_{\mathrm{unlim}}$, $s^y_{\mathrm{unlim}}$, $s^z_{\mathrm{unlim}}$ are unlimited slopes and $\Phi_c$ is the Barth-Jespersen limiter,
which must be computed first.



Unlimited Slopes. In 1D we can use just the central difference as the
unlimited slope. To compute unlimited slopes in 2D we construct a contour
surrounding the cell and use Green's Theorem. In 3D this would require
computing intersections of this neighborhood with the original grid, which
would be too slow, so we must use another method.

For each cell we construct the functional

$$ F(s^x_c, s^y_c, s^z_c) = \sum_{c_n \in C(c)} \left( \bar\rho_{c_n} - \frac{\int_{c_n} \rho_c(x,y,z)\,dx\,dy\,dz}{V(c_n)} \right)^2, \tag{8} $$

which measures the sum of squared differences between the mean values in the
neighboring cells and the average values, over the same neighboring cells, of
the function reconstructed in the original cell. We want to minimize this
functional, i.e. we want the reconstructed function to be as close to the mean
values in the neighboring cells as possible.
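
The minimization of (8) can be illustrated by a small sketch. It relies on the
fact that the average of a linear function over a cell equals its value at the
cell centroid, so each term of (8) reduces to the mismatch between a neighbor
mean and the reconstruction evaluated at the neighbor centroid. The function
names and the assumption that the centroids are already available are
hypothetical, chosen only for this illustration; this is not the authors'
code.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("neighbour centroids do not span 3D")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][k] * x[k] for k in range(i + 1, 3))) / M[i][i]
    return x


def unlimited_slopes(mean_c, center_c, neighbor_means, neighbor_centers):
    """Least-squares slopes (s_x, s_y, s_z) minimising the functional (8).

    Assumes the cell centroids are known (hypothetical inputs)."""
    # Normal equations A s = b with A = sum d d^T and b = sum d * (mean_n - mean_c),
    # where d is the offset of a neighbour centroid from the centroid of cell c.
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for mean_n, center_n in zip(neighbor_means, neighbor_centers):
        d = [center_n[k] - center_c[k] for k in range(3)]
        for i in range(3):
            b[i] += d[i] * (mean_n - mean_c)
            for j in range(3):
                A[i][j] += d[i] * d[j]
    return solve3(A, b)
```

For cell means sampled from a linear function the returned slopes coincide
with its gradient, which is the linearity preservation required of the
reconstruction.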

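It is easy to compute the derivatives of this functional with respect to all
three slopes and set them equal to zero. This gives the linear system

$$ \frac{\partial F}{\partial s^x_c} = 0, \qquad \frac{\partial F}{\partial s^y_c} = 0, \qquad \frac{\partial F}{\partial s^z_c} = 0, \tag{9} $$

which can be easily solved and gives us the unlimited slopes $s^x_{unlim}$,
$s^y_{unlim}$, $s^z_{unlim}$.

Limited Slopes. To limit the slopes we evaluate the Barth-Jespersen limiter at
each cell vertex $n$ and then take the minimum over the vertices as the cell
limiter,

$$ \Phi_c = \min_n \Phi_n, \qquad \Phi_n = \begin{cases} \min\left(1, \dfrac{\rho_c^{\max} - \bar\rho_c}{\rho_n^{unlim} - \bar\rho_c}\right) & \text{for } \rho_n^{unlim} - \bar\rho_c > 0, \\ \min\left(1, \dfrac{\rho_c^{\min} - \bar\rho_c}{\rho_n^{unlim} - \bar\rho_c}\right) & \text{for } \rho_n^{unlim} - \bar\rho_c < 0, \\ 1 & \text{for } \rho_n^{unlim} - \bar\rho_c = 0, \end{cases} \tag{10} $$

which ensures preservation of local extrema and also preservation of a linear
function. Here $\rho_n^{unlim}$ is the value of the reconstructed function
(using the unlimited slopes) at the node $n$, and $\rho_c^{\min}$,
$\rho_c^{\max}$ are the local bounds over the neighborhood $C(c)$. The limiter
is described in detail in [2].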
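
The limiter (10) translates almost literally into code. The following is a
minimal sketch, assuming that the unlimited nodal values and the local bounds
have already been computed and that the cell mean lies within the bounds; the
function and argument names are hypothetical.

```python
def barth_jespersen(mean_c, node_values, lower, upper):
    """Cell limiter Phi_c in [0, 1] following (10).

    node_values: unlimited reconstruction evaluated at the cell vertices
    lower, upper: local bounds over the neighbourhood (assumed to bracket mean_c)
    """
    phi_c = 1.0
    for value in node_values:
        delta = value - mean_c
        if delta > 0.0:
            phi_n = min(1.0, (upper - mean_c) / delta)
        elif delta < 0.0:
            phi_n = min(1.0, (lower - mean_c) / delta)
        else:
            phi_n = 1.0
        # the cell limiter is the most restrictive vertex limiter
        phi_c = min(phi_c, phi_n)
    return phi_c
```

Multiplying all three unlimited slopes by the returned factor keeps every
vertex value of the limited reconstruction inside the local bounds.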

Integration over an Arbitrary Polyhedron. The only part of the functional
that we do not know yet is the integral

$$ \int_{c_n} \rho_c(x,y,z)\,dx\,dy\,dz. \tag{11} $$

We also need to compute the integrals in the definition of the cell centers
$x_c$, $y_c$, $z_c$ (6) and of the cell volumes $V(c)$ (1). So we need a method
for the integration of a linear function over an arbitrary polyhedron. We note
that the boundary of the polyhedron is not uniquely defined, because we know
only the vertices of each face: if the face vertices do not lie in one plane,
the face is curved and its shape is not determined by the vertices alone.

We demonstrate our integration procedure on the example of the cell volume;
the integration of an arbitrary linear function is similar. The cell volume
can be written in the form

$$ V(c) = \int_c 1\,dV = \frac{1}{3} \int_c \operatorname{div}\,(x,y,z)^T\,dV \tag{12} $$

and using the Divergence Theorem we can rewrite it as an integral over the
boundary $\partial c$,

$$ V(c) = \frac{1}{3} \oint_{\partial c} (x,y,z)^T \cdot S\,dA. \tag{13} $$

Here the superscript $T$ means the transposition of a vector and $S$ is the
outward unit vector normal to the boundary. The boundary integral can be split
into the sum over all faces $f$ of the face integrals,

$$ V(c) = \frac{1}{3} \sum_{f \in \partial c} \int_f (x,y,z)^T \cdot S\,dA. \tag{14} $$

Now, just by averaging the coordinates of the vertices of each face, we compute
its center, connect it with all face vertices, and split the face integrals
into integrals over the triangles $\triangle$ defined in this way. On each such
triangle the normal $S_\triangle$ is constant, so it can be taken in front of
the integral,

$$ V(c) = \frac{1}{3} \sum_{f \in \partial c} \sum_{\triangle \in f} S_\triangle \cdot \int_\triangle (x,y,z)^T\,dA. \tag{15} $$

Now we project all triangles to the coordinate planes. For each triangle we
select the coordinate plane in which its projection has the biggest area, which
avoids numerical problems with nearly degenerate projections. Using Green's
Theorem we reduce these integrals over triangles to 1D edge integrals, which
can be computed directly from the vertex coordinates. This algorithm gives us
a method for computing the integral of an arbitrary linear function over an
arbitrary polyhedron. More details can be found in [3].
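
The volume computation via (13)-(14) can be sketched as follows. This is a
simplified illustration, not the procedure of [3]: instead of projecting each
triangle to a coordinate plane and applying Green's Theorem, it uses the fact
that on a flat triangle one third of the integral of $(x,y,z)^T \cdot S$
equals the signed volume of the tetrahedron spanned by the triangle and the
origin. The data layout (a vertex list and faces given as index lists ordered
counter-clockwise when seen from outside) is an assumption of the sketch.

```python
def triple(p, q, r):
    """Scalar triple product p . (q x r)."""
    return (p[0] * (q[1] * r[2] - q[2] * r[1])
            + p[1] * (q[2] * r[0] - q[0] * r[2])
            + p[2] * (q[0] * r[1] - q[1] * r[0]))


def polyhedron_volume(vertices, faces):
    """Volume of a polyhedron whose faces are lists of vertex indices,
    ordered counter-clockwise when seen from outside."""
    volume = 0.0
    for face in faces:
        pts = [vertices[i] for i in face]
        # face center: plain average of the face vertices
        center = [sum(p[k] for p in pts) / len(pts) for k in range(3)]
        # split the face into triangles (center, a, b) along its edges
        for a, b in zip(pts, pts[1:] + pts[:1]):
            volume += triple(center, a, b) / 6.0
    return volume
```

Because every face is split through its center, the same code also gives a
well-defined volume for faces whose vertices are not coplanar.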

2.4 Stage 2 — Swept Integration 

The swept region quadrature concept has been explained in detail in [1].

The swept region is created by the movement of a face from its position in the
original grid to its new position. It is bounded by the old face, the new face,
and by the (not necessarily flat) quadrilaterals connecting each edge of the
original face to the corresponding edge of the new face. We can compute the
volume and the mass of such a region; we speak of the swept volume and the
swept mass and use these terms in their signed sense. Suppose we have a cell of
the original mesh and we move just one face, as illustrated in Fig. 1. In this
case the right face moves outward from the original cell and the middle part is
the swept region. In general, all faces can move in different ways and swept
regions can be tangled. If most of the swept region lies outside the original
cell, the swept volume and swept mass are positive, otherwise they are
negative. The mass of a swept region is computed by integrating, over the swept
region, the reconstructed function of the cell in which most of the swept
region lies. The new cell mass is composed of the mass of the original cell and
the masses of all its swept regions,

$$ m^*(c) = m(c) + \sum_{f \in \mathcal{F}(c)} \delta m_f. \tag{16} $$

Here $f$ denotes a swept region from the set $\mathcal{F}(c)$ of all swept
regions of the cell $c$. The new mean value can then be computed as

$$ \bar\rho^*_c = \frac{m^*(c)}{V(\tilde c)}, \tag{17} $$

where $V(\tilde c)$ is the volume of the rezoned cell. As noted before, this
value can violate the local bounds because the integration is only approximate,
so the third stage is necessary to enforce local-bound preservation.
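
The bookkeeping of (16)-(17) is easiest to see in one dimension, where a
"face" is a node and the swept region is the interval between the old and the
new node position. The sketch below is such a 1D analogue, assumed here only
to keep the example short (the paper itself works with polyhedral cells); all
names are hypothetical and the boundary nodes are assumed to stay fixed.

```python
def remap_1d(x_old, x_new, means, slopes):
    """Swept-region remap of cell means from nodes x_old to nodes x_new (1D).

    Cell i spans nodes i and i+1; slopes are the (limited) reconstruction slopes.
    Returns the new cell masses and the new cell means.
    """
    n = len(means)
    mass = [means[i] * (x_old[i + 1] - x_old[i]) for i in range(n)]
    centers = [0.5 * (x_old[i] + x_old[i + 1]) for i in range(n)]

    def swept_mass(j):
        # signed mass swept by node j moving from x_old[j] to x_new[j]
        a, b = x_old[j], x_new[j]
        if a == b:
            return 0.0
        # donor cell: the old cell in which the swept interval lies
        i = j if b > a else j - 1
        i = min(max(i, 0), n - 1)
        mid = 0.5 * (a + b)  # midpoint rule is exact for a linear reconstruction
        return (means[i] + slopes[i] * (mid - centers[i])) * (b - a)

    dm = [swept_mass(j) for j in range(n + 1)]
    # the right node moving right adds mass, the left node moving right removes it
    new_mass = [mass[i] + dm[i + 1] - dm[i] for i in range(n)]
    new_means = [new_mass[i] / (x_new[i + 1] - x_new[i]) for i in range(n)]
    return new_mass, new_means
```

Each swept mass enters two neighboring cells with opposite signs, so the total
mass is conserved by construction, and a linear function is remapped exactly.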



2.5 Stage 3 — Repair 

The repair stage works as a conservative redistribution of the conserved
quantity. It corrects the overshoots and undershoots back to their local
bounds. First we must compute these local bounds. For each cell $c$ we define a
bound-determining neighborhood $C(c)$, which is a patch of the original grid
fully covering the new cell. Usually we use the original cell plus its nearest
neighbors. We compute the local extrema in this neighborhood,

$$ \rho_c^{\min} = \min_{c_n \in C(c)} \bar\rho_{c_n}, \qquad \rho_c^{\max} = \max_{c_n \in C(c)} \bar\rho_{c_n}. \tag{18} $$

We show the repair on the example of a violation of the lower bound,

$$ \bar\rho^*_c < \rho_c^{\min}; \tag{19} $$

a violation of the upper bound is treated similarly. First we compute the mass
which is needed in the cell to bring the mean value back to the local minimum,

$$ \delta m_c^{needed} = \left(\rho_c^{\min} - \bar\rho^*_c\right) V(\tilde c), \tag{20} $$

where $V(\tilde c)$ is the volume of the rezoned cell. We want our algorithm to
be conservative, so we do not just add this mass to the offending cell, but we
look for available mass in the bound-determining neighborhood. For each
neighboring cell we compute the mass

$$ \delta m_{c_n}^{avail} = \max\left( \left(\bar\rho^*_{c_n} - \rho_{c_n}^{\min}\right) V(\tilde c_n),\ 0 \right), \tag{21} $$

which can safely be taken from that cell without violating its own local bound.
The total available mass in the neighborhood is

$$ \delta m^{avail} = \sum_{c_n \in C(c)} \delta m_{c_n}^{avail}. \tag{22} $$

If the available mass is too small, we extend the stencil and look for the
available mass in a larger area. If there is enough mass available, we perform
the repair. We bring the offending value back to the local minimum,

$$ m'(c) = \rho_c^{\min}\, V(\tilde c), \tag{23} $$

and we take the mass from the neighborhood proportionally to the mass
available,

$$ m'(c_n) = m^*(c_n) - \frac{\delta m_{c_n}^{avail}}{\delta m^{avail}}\, \delta m_c^{needed}. \tag{24} $$

In [1] we proved that this algorithm terminates after a finite number of steps
and that the repair stage corrects all local-bound violations.
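
The lower-bound repair (20)-(24) can be sketched in a few lines. The version
below is deliberately simplified and rests on assumptions not taken from the
paper: it handles only lower-bound violations, it uses one fixed neighbor list
per cell, and it raises an error instead of extending the stencil when the
neighborhood does not hold enough mass. All names are hypothetical.

```python
def repair_lower_bound(means, volumes, lower, neighbors):
    """Conservatively repair violations of the lower bounds.

    means, volumes: remapped cell means and volumes of the new cells
    lower: local lower bound of every cell
    neighbors: for every cell, the list of cells it may take mass from
    """
    masses = [g * v for g, v in zip(means, volumes)]
    for c in range(len(masses)):
        needed = lower[c] * volumes[c] - masses[c]           # cf. (20)
        if needed <= 0.0:
            continue
        avail = {n: max(masses[n] - lower[n] * volumes[n], 0.0)
                 for n in neighbors[c]}                      # cf. (21)
        total = sum(avail.values())                          # cf. (22)
        if total < needed:
            # the paper extends the stencil here; this sketch simply gives up
            raise ValueError("not enough mass in the neighbourhood of cell %d" % c)
        masses[c] = lower[c] * volumes[c]                    # cf. (23)
        for n, a in avail.items():
            masses[n] -= a / total * needed                  # cf. (24)
    return [m / v for m, v in zip(masses, volumes)]
```

The mass added to the offending cell equals the mass removed from its
neighbors, so the total is unchanged, and no neighbor is pushed below its own
lower bound because at most its available mass is taken.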



3 Numerical Tests 



3.1 Orthogonal Uniform Grid 

In the first example the underlying function is equal to zero everywhere
except in a spherical region around the center of the computational domain
$(0,1)^3$, where it is equal to 1,

$$ g(x,y,z) = \begin{cases} 1 & \text{for } r < 0.25, \\ 0 & \text{else,} \end{cases} \tag{25} $$

where $r$ is the distance from the center of the domain. We define a uniform
orthogonal grid in the computational domain; the initial function is shown in
Fig. 2. We move the grid by the tensor product movement

$$ \ldots = (1-\alpha)\,\ldots, \qquad \ldots = (1-\alpha)\, z_n + \alpha\, z_n^{\ldots}, \tag{26} $$

where

$$ \alpha = 0.5 \sin(4\pi t), \qquad t = N/N_{\max}, \tag{27} $$

and $t \in (0,1)$ is the time of the $N$-th time step.

We make $N_{\max} = 200$ remappings to accumulate errors and to make the
problematic regions visible. In Fig. 3 we can see this spherical function
remapped using only unlimited slopes. This causes more errors, so the effect
of the repair stage is more obvious. In part a) of the figure we can see the
function without the repair stage. The light gray cells show areas where the
local bounds are violated. In part b) we see the same remapping with repair;
no values violate the bounds.

Fig. 3. Spherical function remapped 200 times using only unlimited
reconstruction: a) without repair, b) with repair

3.2 Tetrahedral Grid with Random Movement 

The second numerical example uses the same cubical computational domain with a
tetrahedral mesh inside, consisting of about 9000 tetrahedra. We use the same
spherical function as before, shown in Fig. 4. Now we shake the grid randomly
10 times and remap between these grids. In the last time step we remap back to
the original grid. Fig. 5 shows the result obtained with the Barth-Jespersen
limiter, with and without the repair stage. Again, we can see several white
cells in part a), where the bounds are violated. In part b) the repair stage
corrects everything and no problem with bound preservation is observed.

Fig. 5. Spherical function remapped 10 times using the Barth-Jespersen
limiter: a) without repair, b) with repair



4 Conclusion 

In this article we constructed an efficient algorithm for remapping a function
between two similar grids. It is face-based and usable in 3D, unlike the most
natural algorithm based on exact integration, which is not feasible in 3D. The
algorithm is conservative (the total mass remains constant), local-bound
preserving (it does not create new extrema), stable and linearity preserving.
We presented several numerical examples to show that it can be used for
different types of grids and grid movements.


Summary. A finite difference algorithm for the solution of the stationary
diffusion equation on an unstructured triangular grid has been developed
earlier by the support operator method. The support operator method first
constructs a discrete divergence operator from the divergence theorem and then
constructs a discrete gradient operator as the adjoint operator of the
divergence. The adjointness of the operators is based on the continuum Gauss
theorem, which remains valid also for the discrete operators. Here we extend
the method to general Robin boundary conditions, generalize it to the time
dependent heat equation and perform the analysis of the space discretization.
A one-parameter family of discrete vector inner products which produce exact
gradients for linear functions is designed. Our method works very well for
discontinuous diffusion coefficients and for very rough or very distorted
grids, which appear quite often e.g. in Lagrangian simulations.



1 Introduction 

The mimetic finite difference methods preserve fundamental properties of the 
original continuum differential operators and allow the discrete approximations 
of partial differential equations (PDEs) to mimic critical properties including 
conservation laws and symmetries in the solution of the underlying physical 
problem [1]. The discrete analogs of differential operators satisfy the identi- 
ties and theorems of vector and tensor calculus [2] and provide new reliable 
algorithms for a wide class of PDEs. In [3] the mimetic method for parabolic 
diffusion equation has been developed on 2D quadrilateral logically rectangu- 
lar grids and in [4] the method has been developed for stationary diffusion 
equation on unstructured triangular grid. In this paper we apply our ideas
to the construction of mimetic methods for the solution of parabolic diffusion
problems in strongly heterogeneous materials on unstructured triangular
computational grids in 2D, capable of treating an arbitrary computational
region.

1.1 Continuous problem

We consider the heat equation with general Robin boundary conditions

$$ u_t - \operatorname{div} K \operatorname{grad} u = f \quad \text{on } \Omega, \qquad \alpha u + \beta\,(K \operatorname{grad} u, n) = \psi \quad \text{on } \partial\Omega, \tag{1} $$

on an arbitrary 2D region $\Omega$ with boundary $\partial\Omega$, with a
possibly discontinuous diffusion coefficient $K$ and unknown function $u$. The
goal of the paper is to develop a numerical method for this problem on a given
triangular grid. The method should also work well on the bad quality meshes
that typically appear in Lagrangian hydrodynamics computations, where we need
to treat the parabolic part of the model, such as the heat conductivity of the
fluid. The spatial discretization will be the same as in [4]. For the
derivation of the discretization we will use the first order coordinate
invariant operators div and grad, so we first transform the heat equation (1)
into the first order system $u_t + \operatorname{div} w = f$,
$w = -K \operatorname{grad} u$, where $w$ is the heat flux.

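For further analysis we introduce the generalized gradient operator

$$ G u = -K \operatorname{grad} u \tag{2} $$

and the extended divergence operator

$$ D w = \begin{cases} \operatorname{div} w & \text{on } \Omega, \\ -(w, n) & \text{on } \partial\Omega, \end{cases} \tag{3} $$

and look at some integral properties of these operators. First we note that
the divergence Green formula

$$ \int_\Omega \operatorname{div} w\, d\Omega - \oint_{\partial\Omega} (w, n)\, dS = 0 \tag{4} $$

can be written as $(D w, 1)_H = 0$, where the inner product of scalar functions
$(\cdot,\cdot)_H$ on the space $H$ of sufficiently smooth scalar functions on
$\Omega$ is defined by
$(u, v)_H = \int_\Omega u v\, d\Omega + \oint_{\partial\Omega} u v\, dS$. On
the other hand, the Gauss theorem

$$ \int_\Omega u \operatorname{div} w\, d\Omega - \oint_{\partial\Omega} u\,(w,n)\, dS + \int_\Omega (w, K^{-1} K \operatorname{grad} u)\, d\Omega = 0 \tag{5} $$

can be written as

$$ (D w, u)_H = (w, G u)_{\mathcal H}, \tag{6} $$

where the inner product of vector functions $(\cdot,\cdot)_{\mathcal H}$ on the
space $\mathcal H$ of vector functions is defined by
$(A, B)_{\mathcal H} = \int_\Omega (K^{-1} A, B)\, d\Omega$.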

Our operators $G$, $D$ act between the spaces $H$ and $\mathcal H$ as
$G : H \to \mathcal H$, $D : \mathcal H \to H$. The Gauss theorem (6) implies
that the operator $G$ is the adjoint of the operator $D$ in the sense of the
inner products defined above, $G = D^*$.

1.2 Semidiscrete problem

The heat equation problem (1) is discretized in time by the fully implicit
scheme

$$ \frac{u^{n+1} - u^n}{\Delta t} - \operatorname{div} K \operatorname{grad} u^{n+1} = f \quad \text{on } \Omega, \qquad \alpha u^{n+1} + \beta\,(K \operatorname{grad} u^{n+1}, n) = \psi \quad \text{on } \partial\Omega, \tag{7} $$

which can be written in operator form as $A u^{n+1} = F u^n$, where the
operators $A$ and $F$ are given by

$$ A u = \begin{cases} u/\Delta t - \operatorname{div} K \operatorname{grad} u & \text{on } \Omega, \\ \alpha u + \beta\,(K \operatorname{grad} u, n) & \text{on } \partial\Omega, \end{cases} \qquad F u = \begin{cases} f + u/\Delta t & \text{on } \Omega, \\ \psi & \text{on } \partial\Omega. \end{cases} $$

For simplicity we assume Neumann boundary conditions in the semidiscrete
problem (7), so $\alpha = 0$, $\beta = 1$. In this case the global operator $A$
is given by

$$ A = I/\Delta t + D G, \tag{8} $$

where the operator $I$ is the identity inside the region $\Omega$ and zero on
the boundary $\partial\Omega$, $G$ is the generalized gradient operator (2) and
$D$ is the extended divergence operator (3). Since $G$ is the adjoint operator
of $D$, $G = D^*$, one can quite easily show [4, 3] that the global operator
$A$ is self-adjoint and positive definite, $A = A^* > 0$. The same conclusion
can also be reached for the case of Dirichlet and Robin boundary conditions
[5].



2 Spatial discretization 

We first describe the approximation of scalar and vector functions on the
given unstructured triangulation of the region $\Omega$. Triangles of the grid
are numbered by the index $i$, vertices by the index $j$ and edges by the index
$k$, with boundary edges ordered first. The scalar function $u$ is approximated
by a piecewise constant discrete function with constant values $U_i$ inside
each triangle $i$ and constant values $U_k$ on each boundary edge $k$. The
vector heat flux function $w$ is discretized by point values at the center of
each edge, namely by the projection $W_k$ of $w$ onto the normal to the edge,
as shown in Fig. 1. The normal flux is continuous across the edges. We define
the space $HC$ of discrete scalar functions (piecewise constant functions
inside each triangle and on each boundary edge) with the natural inner product

$$ (U, V)_{HC} = \sum_{i=1}^{N_t} U_i V_i\, VC_i + \sum_{k=1}^{N_{eb}} U_k V_k\, S_k, $$

where $N_t$ is the number of triangles, $N_{eb}$ is the number of boundary
edges, $VC_i$ is the area of triangle $i$ and $S_k$ is the length of the
boundary edge $k$. The space $HL$ of discrete vector functions has the natural
inner product

$$ (A, B)_{HL} = \sum_{i=1}^{N_t} (A, B)_i\, \frac{VC_i}{K_i}. $$

Fig. 1. Projection of the vector function w onto the edge normals at the
centers of the edges

Both natural inner products of discrete functions are approximations of the
inner products of continuum functions defined above in Section 1.1.

The traditional definition of the discrete inner product $(A,B)_i$ of vector
functions on the triangle $i$ is

$$ (A,B)_i = \frac{1}{3} \sum_{j=1}^{3} (A,B)_{i,j}, \tag{9} $$

whose part at the vertex $j^i_j$ of the triangle $i$ is

$$ (A,B)_{i,j} = \frac{A_{k^i_j} B_{k^i_j} + A_{k^i_{j-1}} B_{k^i_{j-1}} + \left(A_{k^i_j} B_{k^i_{j-1}} + A_{k^i_{j-1}} B_{k^i_j}\right) \cos\varphi^i_j}{\sin^2 \varphi^i_j}. \tag{10} $$

Such an inner product gives the exact gradient of linear functions.



2.1 General vector local inner product 

The general discrete inner product of vector functions on a triangle $i$ can
be defined by a symmetric positive definite matrix $M$,

$$ (A,B)^M_i = (M A)\cdot B, \qquad M = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{12} & m_{22} & m_{23} \\ m_{13} & m_{23} & m_{33} \end{pmatrix}. $$

Any triangle can be transformed into the triangle with vertices $(0,0)$,
$(1,0)$, $(x,y)$ by translation, rotation and scaling. We continue the
explanation on this triangle. The Gauss theorem (5) is applied to our triangle
with a linear scalar function $u$ and an arbitrary vector function $w$. We
require the inner product to define the exact gradient, which results in a
system of 6 linear equations for 6 variables, the elements $m_{kl}$ of the
matrix $M$. Only 5 equations of the system are independent and their solution
is

((x - 2)x + + l){x + 1 )S 2 + 3mi2Siy^ 

352^2 

-{x^ + - 1)^1^2 + 3mi2y^ 

3s2v‘^ 

+ y‘^){x - 2)si 4- 3mi2S2y^ 

-(x^ -2x + y‘^)siS 2 + 3mi2y^ 

((a: - l)x + 2 /^ + 1)5iS 2 + 3mi2i/^ 

351^2^2 

where the five expressions give, in order, $m_{11}$, $m_{13}$, $m_{22}$,
$m_{23}$ and $m_{33}$, the lengths of the triangle edges are
$s_1 = \sqrt{(x-1)^2 + y^2}$, $s_2 = \sqrt{x^2 + y^2}$, and $m_{12}$ remains as
a free parameter.

The Sylvester criterion for positive definiteness of the matrix $M$ results in
a set of 3 inequalities which, after simplification by quantifier elimination,
reduces to the constraint
$m_{12} > s_1 s_2 \left(1 + (x+1)(x-2)/y^2\right)/9$ on $m_{12}$. We have thus
found a family of inner products, depending on the parameter $m_{12}$, which
produce the exact gradient for linear functions. The traditional inner product
defined by (9)-(10) belongs to this family. The free parameter $m_{12}$ can be
used to improve some properties of our numerical algorithm, such as its
accuracy and the condition number of the local matrix $M$ [6].
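
The Sylvester criterion mentioned above states that a symmetric matrix is
positive definite exactly when all its leading principal minors are positive.
For a 3x3 matrix this gives the three inequalities referred to in the text. A
minimal sketch of the check follows; the function name is hypothetical and the
matrix is assumed to be symmetric (the paper's symbolic entries are not
reproduced here).

```python
def is_positive_definite_3x3(m):
    """Sylvester criterion for a symmetric 3x3 matrix given as nested lists:
    all three leading principal minors must be positive."""
    d1 = m[0][0]
    d2 = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    d3 = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return d1 > 0 and d2 > 0 and d3 > 0
```

Evaluating this test for a given triangle and a trial value of the free
parameter is a quick numerical way to check whether that value is admissible.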

2.2 Divergence and gradient discretization 

Before proceeding to the divergence and gradient discretizations we need to
define formal inner products. The formal inner product for scalar discrete
functions is given by

$$ [U, V]_{HC} = \sum_{i=1}^{N_t} U_i V_i + \sum_{k=1}^{N_{eb}} U_k V_k, \qquad (U, V)_{HC} = [\mathcal{M} U, V]_{HC}, $$

and for vector discrete functions by

$$ [A, B]_{HL} = \sum_{k=1}^{N_e} A_k B_k, \qquad (A, B)_{HL} = [\mathcal{L} A, B]_{HL}, $$

where $N_e$ is the number of edges and we have introduced the operators
$\mathcal{M}$, $\mathcal{L}$ which connect the natural and formal inner
products. Note that the formal inner products are plain sums of products of
discrete values, while the natural inner products approximate the inner
products on the spaces of continuum functions.



mil = 
mi3 

m22 

^23 = 
^33 = 
The discrete operators divergence $D$ and gradient $G$ act between the spaces
of discrete functions as $D : HL \to HC$, $G : HC \to HL$. The divergence Green
formula (4) can be written in the discrete case as $(D W, 1)_{HC} = 0$, and
when applied to one triangle $i$ it gives us the discretization of the
divergence

$$ (D W)_i = \frac{1}{VC_i} \sum_{j=1}^{3} W_{k^i_j}\, S_{k^i_j}\, \operatorname{sign}\left(j^i_{j+1} - j^i_j\right) $$

inside the triangle $i$ (the factor $\operatorname{sign}(j^i_{j+1} - j^i_j)$
distinguishes the unique direction on the edge $k^i_j$ connecting the vertices
$j^i_j$ and $j^i_{j+1}$ [4]) and $(D W)_k = -W_k n_k$ on the boundary edge $k$.
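
The discrete divergence on one triangle is just the Green formula: the sum of
the normal fluxes through the three edges divided by the triangle area. The
sketch below illustrates this under simplifying assumptions that differ from
the bookkeeping of [4]: the vertices are given in counter-clockwise order and
the outward normal of each edge is built directly, instead of storing one
normal direction per edge and correcting it by the sign factor. The names and
the callable `w` are hypothetical.

```python
import math


def discrete_divergence(vertices, w):
    """Discrete divergence on a triangle with counter-clockwise vertices.

    w(x, y) returns the vector field; it is sampled at the edge centers and
    projected on the outward edge normals, as in the mimetic discretization.
    """
    (x1, y1), (x2, y2), (x3, y3) = vertices
    area = 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    if area <= 0.0:
        raise ValueError("vertices must be ordered counter-clockwise")
    total = 0.0
    edges = ((vertices[0], vertices[1]),
             (vertices[1], vertices[2]),
             (vertices[2], vertices[0]))
    for (ax, ay), (bx, by) in edges:
        length = math.hypot(bx - ax, by - ay)
        nx, ny = (by - ay) / length, -(bx - ax) / length  # outward unit normal
        wx, wy = w(0.5 * (ax + bx), 0.5 * (ay + by))      # value at edge center
        total += (wx * nx + wy * ny) * length             # normal flux W_k S_k
    return total / area
```

For a linear vector field the edge-center values integrate the flux exactly,
so the result equals the exact (constant) divergence.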

The Gauss theorem (6) can be rewritten in the discrete case as
$(DW, U)_{HC} = (W, GU)_{HL}$, so that the discrete gradient is the adjoint of
the discrete divergence, $G = D^*$. When we transform this equality of inner
products into formal inner products using the operators $\mathcal{L}$ and
$\mathcal{M}$ we get

$$ [W, D^\dagger \mathcal{M} U]_{HL} = [W, \mathcal{L} D^* U]_{HL}, \qquad \mathcal{L} D^* = D^\dagger \mathcal{M}, $$

where $D^\dagger$ is the formal adjoint of $D$. Now the formal adjoint
$D^\dagger$ can be constructed [4], and to get the gradient $W = GU$ (which is
the natural adjoint $D^* U$) of the scalar grid function $U$ the system

$$ \mathcal{L} W = D^\dagger \mathcal{M} U \tag{11} $$

has to be solved. The gradient constructed in this way as the adjoint of the
divergence has a global stencil.

The discrete approximation of the global operator (8) is symmetric and
positive definite, and for its inversion, which is needed in each time step of
the implicit method, we employ the conjugate gradient method. The numerical
gradient needed in every iteration of the conjugate gradient method is
computed by solving (11) for $W$ by the standard Gauss-Seidel method. Our
method is exact for piecewise constant or piecewise linear solutions;
otherwise it is second order accurate.

2.3 Boundary conditions 

The mimetic discretization outlined above incorporates different kinds of
boundary conditions, namely Dirichlet, Neumann and Robin ones. Each type of
boundary condition is treated differently. For Dirichlet boundary conditions
the value of the scalar function $u$ on the boundary is known, the gradient on
the boundary is computed, and the boundary conditions are fulfilled exactly.
For Neumann boundary conditions the gradient on the boundary is known, so we
do not solve for it, and the boundary flux (gradient) is moved to the right
hand side of the global system, which is the discretization of the
semidiscrete system (7). The value of the scalar function $u$ on the Neumann
boundary is not needed, and again the boundary conditions are fulfilled
exactly.

The situation is different for the Robin boundary conditions, where both the
value of the scalar function $u$ and the boundary flux (gradient of $u$) are
unknown on the boundary. The discrete form of the boundary conditions in (7)
is included in the global discrete system, and the boundaries with Robin
conditions are included in the scalar products used for computing the
conjugate gradient coefficients and residuals. More details on the numerical
treatment of boundary conditions by mimetic methods can be found in [5].



3 Numerical tests 

In this section we provide several tests of the developed mimetic method. For
all tests we use the initial condition $u^0 = 0$ everywhere and compute until a
time sufficiently large for the solution to reach the steady state. The exact
discrete solutions used in the error evaluation are given by the point values
of the exact continuous solution at the centroid (the average of the vertices)
of each triangle.



3.1 Piecewise linear and quadratic tests 



Both the piecewise linear and the piecewise quadratic test are solved on the
region $(x,y) \in (0,1) \times (0,1)$ with a discontinuous piecewise constant
diffusion coefficient

$$ K = \begin{cases} k_1, & 0 < x < 0.5, \\ k_2, & 0.5 < x < 1. \end{cases} $$

We solve these problems for the particular values $k_1 = 1$, $k_2 = 2$ of the
diffusion coefficient until the time $t = 10$, when the solution has converged
to the stationary one. Of course, the triangulation is done in such a way that
the whole discontinuity line $x = 0.5$ is covered by edges, so that the
diffusion coefficient is constant inside each triangle.

The stationary exact solution of the piecewise linear test, coming e.g. from
[7, 3], is the piecewise linear function

$$ u = \begin{cases} \dfrac{k_2 x + 2 k_1 k_2}{0.5\,(k_1 + k_2) + 4 k_1 k_2}, & 0 < x < 0.5, \\[2ex] \dfrac{k_1 x + 2 k_1 k_2 + 0.5\,(k_2 - k_1)}{0.5\,(k_1 + k_2) + 4 k_1 k_2}, & 0.5 < x < 1. \end{cases} $$

The maximal numerical errors for this test are shown in Table 1(a) and are
close to machine precision, showing that our method is exact for piecewise
linear solutions.

The stationary exact solution of the piecewise quadratic test, coming e.g.
from [8, 3], is the piecewise quadratic function

$$ u = \begin{cases} a_1 x^2 + b_1 x, & 0 < x < 0.5, \\ a_2 x^2 + b_2 x + c_2, & 0.5 < x < 1, \end{cases} $$




Mimetic Finite Difference Methods for Diffusion Equations 375 



where ai = -l/h, hi = (3a2 + ai)/c2/(4(A:i + /^ 2 )), ^2 = C 2 = 

— &2 — ^^ 2 / 2 . The convergence analysis for this test is presented by maximum 
errors and numerical order of convergence in Table 1(a) and confirms that our 
method is second order for non-linear solutions, even in the case of discontin- 
uous diffusion coefficients. 
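
The numerical order of convergence can be recomputed from the maximum errors
on consecutive grids. The sketch below uses the errors of the piecewise
quadratic test as printed in Table 1(a) and assumes that each refinement,
which quadruples the number of triangles, halves the mesh size $h$; this
assumption and the function name are not taken from the paper.

```python
import math


def convergence_orders(errors, refinement=2.0):
    """Observed order q between consecutive grids, q = log(E_coarse / E_fine) / log(refinement)."""
    return [math.log(coarse / fine) / math.log(refinement)
            for coarse, fine in zip(errors, errors[1:])]


# E_max of the piecewise quadratic test for 126, 504, 2016 and 8064 triangles,
# as printed (rounded) in Table 1(a)
errors = [0.0024, 0.00063, 0.00016, 0.000040]
orders = convergence_orders(errors)
```

Because the tabulated errors are rounded to two significant digits, the
recomputed orders can differ slightly from the printed values 1.96, 1.98 and
1.99, but they stay close to 2, in line with the claimed second order
accuracy.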

3.2 Anisotropic triangulation 

One of our aims was to develop a method that also works well on bad quality,
rather distorted triangular grids, which appear quite often in Lagrangian
meshes moving with the fluid. To show how our mimetic method works on bad
quality grids including triangles with big angles, we choose the initial grid
on the region $(x,y) \in (-1,1) \times (0,1)$ shown in Fig. 2 (a) and stretch
this grid by a parameter $a$, producing bad quality grids on the region
$(x,y) \in (-a,a) \times (0,1)$. The initial grid stretched by the parameter
$a = 5$ is shown in Fig. 2 (b). We solve one problem on a series of grids
obtained by stretching with increasing parameter $a$.




Fig. 2. Grid used for stretching the triangulation; (a) for parameter $a = 1$,
i.e. $x \in (-1,1)$, (b) for parameter $a = 5$, i.e. $x \in (-5,5)$



The problem with the anisotropic triangulation is the heat equation (1) with
diffusion coefficient $K = 1$, right hand side $f = -2/a^2$, zero Dirichlet
boundary conditions on the left and right and zero Neumann boundary conditions
on the top and bottom. The stationary solution of this problem is
$u = x^2/a^2 - 1$. This problem is solved until the time $t = 10 a^2$, when
the solution reaches the stationary state. We compare the results of our
mimetic support operator method with the results of the standard linear finite
element method, for which we use its implementation in the Partial
Differential Equation Toolbox in Matlab. This comparison is presented by
maximal errors in Table 1(b) for the stretching parameter $a$ growing from 1
to 10 000. Our mimetic method keeps its accuracy for all values of $a$, while
the finite element method is losing accuracy already for $a = 100$. The
minimal value of the numerical solution, which should be minus one, is also
presented in Table 1 (b). One can note that the finite element method does not
resolve this minimum for big $a$; its solution remains flat and very close to
the initial condition $u^0 = 0$. The reason for this behavior of the finite
element method is that for big $a$ the linear interpolation introduces
zig-zagging in the $y$ direction (there are no edges close to parallel to the
$y$ axis) for a solution with curvature in the $x$ direction. This zig-zagging
consumes too much of the overall energy and the parabola in the $x$ direction
is not resolved well.



Table 1. (a) Convergence table of the maximum errors $E_{\max}$ for the
piecewise linear and piecewise quadratic tests; for the quadratic test the
numerical order of convergence $q$ is also shown. (b) Maximum errors and minima
of the numerical solution of the problem with the stationary solution
$u = x^2/a^2 - 1$ on grids stretched by the parameter $a$, for the mimetic
support operator method (MSOM) and the standard linear finite element method
(FEM).

(a)
Nr. of       piecewise linear test    piecewise quadratic test
triangles    E_max                    E_max         q
126          1.4 · 10^(...)           0.0024        1.96
504          2.5 · 10^(...)           0.00063       1.98
2016         8.9 · 10^-12             0.00016       1.99
8064         9.4 · 10^-12             0.000040

(b)
             MSOM                     FEM
a            E_max      min(u)        E_max      min(u)
1            0.011      -1.0087       0.0039     -1.0028
10           0.0055     -1.0032       0.072      -0.95
100          0.0055     -1.0031       0.88       -0.12
1000         0.0055     -1.0032       1.0        -0.0013
10000        0.0074     -1.0034       1.0        -0.00001
Summary. The main focus of this paper is on stable FE-discretisations for
treating systems of partial differential equations arising in glaciology. The
systems are coupled, consisting of a flow problem determining stress, pressure
and velocity and of evolution problems for the temperature and the mean
orientation densities, which describe anisotropic material behaviour. The
proposed strategies are applied to a standard model for describing ice sheet
dynamics and to an enhanced one, taking into account the development of
certain fabrics in the structure of the ice.



1 Introduction 

Climate, climate history and climate forecasts have become more and more
important. If one wants to understand changes of climate in the past and to
make predictions for the future, the climate of an ice age has to be studied.

In the context of climate simulations the flow of polar ice masses represents
an essential part. Hence there emerges a need for appropriate climate boundary
conditions, Greve et al. [11], Huybrechts [13], Fabre et al. [3]. Furthermore,
there is a need for an appropriate description of the thermo-mechanical
material behaviour. For instance, simulations of the future climate need
reliable constitutive relations to generate reliable predictions about the
global hydro-balance. In addition, the treatment of problems arising in cold
region structural engineering, e.g. Calov et al. [1], often becomes more
reliable through such flow simulations. On the other hand, because it may
record the past history of ice and climatic changes and because it is
sensitive to the deformation history of the ice sheet, the microstructure of
polar ice is worth studying.

The growth and retreat of inland ice masses is governed by the snowfall onto
the surface and by the melting and calving of the ice close to and at the
outer ice boundaries. Owing to its own weight, the ice deforms with velocities
of typically 100 meters per year, causing a transport of ice towards the ice
sheet boundaries where the ice melts and calves. This process, in turn, is
influenced by the temperature distribution within the ice, implying a delicate
balance between the thermal and mechanical regimes that are established by the
climate input and the geothermal conditions of the substrate. The
thermomechanically coupled ice dynamics together with the mass flux due to
snowfall and the mass loss in the vicinity of the ice boundaries determine the
thickness distribution of a particular ice sheet.

The deformation of an ice sheet and the variation of its temperature
distribution depend to a large extent on its thermomechanical constitutive
modelling. Here, we treat ice as a rheologically nonlinear, thermally coupled,
viscous fluid, i.e., we assume its fluidity (inverse viscosity) to be
temperature- and stress-dependent, the latter according to a power law with
exponent 3, the former essentially following an Arrhenius-type relationship.

Ice sheets are cold, i.e. below their melting point, except at parts of their
base, where the temperature may reach the melting point. For instance, this
can be concluded from radio-echo sounding of sub-Antarctic lakes (cf. Oswald
and Robin [16]). The usual fluid no-slip boundary conditions only apply when
the basal ice is cold, but at the melting temperature ice can slide over its
base (see Paterson [17]). In Fowler [4] it is mentioned that several models
are based on the assumption of a non-zero sliding velocity, and in some cases
this is even required in order to obtain a solution (see Morland and Johnson
[15] and Hutter et al. [12]). On the other hand, the basal topography in
Antarctica is so rough that for the sliding law we should expect
$v \approx 0$ (cf. Paterson [17], Richardson [18]).

The underlying mathematical model is a coupled system of partial differ- 
ential equations for describing the distributions of velocities, temperature and 
evolution of the geometry of the ice sheets. This system is solved numerically 
by employing the Finite Element (FE) method. The choice of an appropriate 
discretisation is involved to some extent: 

1) assuming the temperature to be known, we have to choose stable elements 
for the saddle-point problem determining stress, pressure and velocity simul- 
taneously. 

2) assuming the velocity field to be known, we have to choose stable elements 
for the convection-diffusion problem determining the temperature. 

The standard fluid model mentioned above is necessarily isotropic and thus
cannot describe the stress-induced anisotropies evident in specimens from boreholes.
The extreme anisotropy of the ice single crystal leads to heterogeneous
intra-granular deformation modes within the polycrystal and hence to the
development of a certain fabric. The expectation is that the climate becomes to
some extent reconstructable from analysing ice-core textures, e.g., Thorsteinsson
[19], in combination with the numerical solution of the ice-sheet flow
problem. Therefore, considerations for an enhanced model are based on two
geometric scales.



2 Mathematical model 

The ice in large ice masses is generally polythermal, i.e., the ice mass consists
of disjoint regions in which the ice is either cold (i.e., its temperature is below
the melting point) or temperate (i.e., it is at the pressure melting point),
but except for a few recent cases theoretical formulations are restricted to
cold ice. For such a case the continuum mechanical postulate that ice is a slow,
gravity-driven, incompressible, heat-conducting, nonlinear viscous fluid yields
the following balance laws of mass, momentum and temperature as well as
constitutive relations:

\begin{align*}
C\sigma - \varepsilon(v) &= 0 && \text{Material} \\
-\operatorname{div} v &= 0 && \text{Incompressibility} \\
-\operatorname{div}\sigma + \nabla p - f &= 0 && \text{Equilibrium} \\
\dot{T} - \Delta T - \sigma\varepsilon(v) &= 0 && \text{Temperature}
\end{align*}

which hold on a bounded domain $\Omega$ in $\mathbb{R}^2$. The right-hand side $f = (0,-g)^T$
is defined by the gravity force $g$. $C$ describes the constitutive relation between
the deviatoric stress $\sigma$ and the strain rate $\varepsilon(v) = (\nabla v + (\nabla v)^T)/2$ and may
be identified with the fourth-order fluidity tensor. The velocity field is denoted
by $v = (v_1, v_2)^T$, the related pressure by $p$ and the temperature by $T$.

Usually the relation $C\sigma = \varepsilon(v)$ is given by

\[ \varepsilon(v) = A(T)\,G(\sigma)\,\sigma , \]

where $A(T)$ denotes the Arrhenius law and $G(\sigma) = |\sigma|^2$ according to Glen.
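As a minimal sketch of this flow law, the following Python function evaluates $\varepsilon = A(T)\,|\sigma|^2\,\sigma$ for a $2\times 2$ deviatoric stress. The Arrhenius parameters `A0` and `Q`, the sample stress, and the use of the Frobenius norm for $|\sigma|$ are illustrative assumptions, not values taken from the paper.

```python
import math

R_GAS = 8.314  # universal gas constant in J/(mol K)

def arrhenius(T, A0=1.0, Q=6.0e4):
    """Temperature factor A(T) = A0 * exp(-Q / (R T)); A0 and Q are placeholder values."""
    return A0 * math.exp(-Q / (R_GAS * T))

def strain_rate(sigma, T):
    """Return eps = A(T) * |sigma|^2 * sigma for a 2x2 stress given as a list of lists."""
    norm_sq = sum(s * s for row in sigma for s in row)  # squared Frobenius norm (assumed |.|)
    factor = arrhenius(T) * norm_sq
    return [[factor * s for s in row] for row in sigma]

sigma = [[1.0, 0.5], [0.5, -1.0]]          # hypothetical deviatoric stress
eps = strain_rate(sigma, 263.15)
eps2 = strain_rate([[2.0 * s for s in row] for row in sigma], 263.15)
```

Because the law is cubic in the stress, doubling `sigma` multiplies every strain-rate component by eight, which is the power law with exponent 3 mentioned in the introduction.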

The basis for applying the Finite Element (FE) method to the classical 
system is the formulation in the variational setting 

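\begin{align*}
(C\sigma,\tau) - (\varepsilon(v),\tau) &= 0 \\
(\operatorname{div} v, q) &= 0 \\
(\sigma,\varepsilon(\varphi)) - (p,\operatorname{div}\varphi) &= (f,\varphi)
\end{align*}

and

\[ (\partial_t T, w) + ((v\cdot\nabla)T, w) + (\nabla T,\nabla w) = (\sigma\varepsilon(v), w) , \]

for arbitrary $\tau, q, \varphi, w$ chosen from suitable function spaces described below.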
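In order to introduce a more compact setting, we define

\[ U := \{\sigma, p, v, T\}, \qquad \Phi := \{\tau, q, \varphi, w\}. \]

With these definitions, we consider the problem of finding $U \in \mathcal{V} := \Sigma \times Q \times V \times L$
with $\mathcal{V} \subset \cdots \times L^2 \times \cdots \times H^1$, fulfilling

\[ B(U;\Phi) = 0 \quad \forall \Phi \in \mathcal{V}, \tag{2} \]

with

\begin{align*}
B(U;\Phi) := {} & (C\sigma,\tau) - (\varepsilon(v),\tau) - (\operatorname{div} v, q) \\
& + (\sigma,\varepsilon(\varphi)) - (p,\operatorname{div}\varphi) - (f,\varphi) \\
& + (\partial_t T, w) + ((v\cdot\nabla)T, w) + (\nabla T,\nabla w) - (\sigma\varepsilon(v), w) .
\end{align*}

Here and in what follows, $(\cdot,\cdot)$ represents the $L^2$ inner product on a bounded
domain $\Omega$ in $\mathbb{R}^2$ and $\|\cdot\|$ the corresponding norm. Furthermore, $H^m$
denotes the standard Sobolev space of $L^2$-functions with derivatives in $L^2(\Omega)$
up to the order $m$, and $H^1_0 \subset H^1$ is the subspace of $H^1$-functions vanishing on
$\Gamma := \partial\Omega$.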

2.1 Boundary conditions 

Sliding (basal slip), as opposed to fabric or temperature-enhanced basal shear,
certainly occurs on ice sheets, particularly at the margins. We consider it
unlikely to occur where basal ice is cold and where shear stresses are close to
zero. In addition, large-scale basal roughness may mean that any sliding which
does occur inland will be very small. Following these remarks, we consider in the
test configurations sketched in Figure 1 no-slip boundary conditions, i.e.
$v_1 = v_2 = 0$ on $\Gamma_x$, and, exploiting symmetry, $v_1 = 0$ on $\Gamma_y$, for the components of
the velocity field $v = (v_1, v_2)^T$.

Furthermore, we have free boundary conditions for the temperature $T$ on
$\Gamma_x$ and $\Gamma_y$ (due to symmetry) and for $v$ on $\Gamma_s := \partial\Omega \setminus (\Gamma_x \cup \Gamma_y)$. Moreover, $T$
is prescribed on $\Gamma_s$ by data from climate input.

Now, we calculate the deformation under the assumption of isotropic material
behaviour. A plot of the velocity field is depicted in Figure 1. We observe
that $\Gamma_s$ is divided into two parts, the inflow and outflow boundary, defined by

\begin{align}
\Gamma_- &= \{x \in \Gamma_s \mid v\cdot n < 0\} , \tag{3} \\
\Gamma_+ &= \{x \in \Gamma_s \mid v\cdot n > 0\} , \tag{4}
\end{align}

where $n$ denotes the outward unit normal of $\Gamma_s$. This notation will be used at
the end of Section 4.
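The splitting of the surface into inflow and outflow parts by the sign of $v\cdot n$ can be sketched in a few lines of Python. The sample points, the velocity field and the constant normal below are hypothetical and only serve to exercise the sign test; they are not the configuration of Figure 1.

```python
def classify_boundary(points, velocity, normal):
    """Split boundary points into inflow (v.n < 0) and outflow (v.n > 0) sets."""
    inflow, outflow = [], []
    for x in points:
        v = velocity(x)
        n = normal(x)
        flux = v[0] * n[0] + v[1] * n[1]   # normal component v . n
        if flux < 0:
            inflow.append(x)
        elif flux > 0:
            outflow.append(x)
        # points with v . n == 0 belong to neither part
    return inflow, outflow

# Hypothetical flat surface y = 1 with outward normal (0, 1)
points = [(0.0, 1.0), (0.25, 1.0), (0.5, 1.0), (0.75, 1.0), (1.0, 1.0)]
velocity = lambda x: (1.0, 0.5 - x[0])     # made-up velocity field
normal = lambda x: (0.0, 1.0)
inflow, outflow = classify_boundary(points, velocity, normal)
```

In this toy setting the normal velocity changes sign at $x = 0.5$, so the two right-most points form the inflow part and the two left-most points the outflow part.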



3 Discretisation 

The full discretisation for the system (2) is derived in two steps. First, we 
perform a discretisation with respect to the time variable, yielding a sequence 
of problems continuous with respect to the space variable. In the second step, 
these problems are approximated by the finite element method. 

3.1 Semi-discretisation 

For discretisation, the time interval $[0, t_M]$ is decomposed as $0 = t_0 < t_1 < \ldots < t_M$
into subintervals $I_m := (t_{m-1}, t_m]$ of length $k_m = t_m - t_{m-1}$.

Fig. 1. Sketch of computational domains for the test examples including structure
of the FE-meshes for the benchmark problem



Integrating in (2) over $I_m$ and approximating the integrals by quadrature
formulas of the type

\[ \int_{I_m} w(t)\,dt \approx k_m\{\alpha w^m + (1-\alpha) w^{m-1}\} , \tag{5} \]

with some $\alpha \in (0,1]$, yields the time-discrete schemes. The choice $\alpha = 1$
corresponds to the backward Euler scheme, while for $\alpha = \tfrac12$ we obtain the
Crank-Nicolson scheme. Here, we only consider the simple Euler scheme, which
reads



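\begin{align}
0 = B(U^m;\Phi) := {} & (T^m - T^{m-1}, w) \tag{6} \\
& + (C\sigma^m,\tau) - (\varepsilon(v^m),\tau) - (\operatorname{div} v^m, q) \notag \\
& + (\sigma^m,\varepsilon(\varphi)) - (p^m,\operatorname{div}\varphi) - (f^m,\varphi) \notag \\
& + k_m\big( ((v^m\cdot\nabla)T^m, w) + (\nabla T^m,\nabla w) - (\sigma^m\varepsilon(v^m), w) \big) . \notag
\end{align}

In what follows the superscript $m$ is omitted.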
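To illustrate the one-parameter family of time-stepping schemes generated by the quadrature rule (5), the sketch below applies it to the scalar model problem $y' = -\lambda y$, which is an assumption made purely for illustration and not part of the ice-sheet system. With $\alpha = 1$ it is the backward Euler scheme, with $\alpha = 1/2$ the Crank-Nicolson scheme.

```python
import math

def theta_step(y_prev, k, lam, alpha):
    """One step of y^m - y^{m-1} = -k*lam*(alpha*y^m + (1-alpha)*y^{m-1})."""
    return y_prev * (1.0 - (1.0 - alpha) * k * lam) / (1.0 + alpha * k * lam)

def integrate(y0, lam, t_end, n_steps, alpha):
    """Advance y' = -lam*y from t = 0 to t_end with a uniform step size."""
    k = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = theta_step(y, k, lam, alpha)
    return y

exact = math.exp(-1.0)
err_be = abs(integrate(1.0, 1.0, 1.0, 10, 1.0) - exact)   # backward Euler
err_cn = abs(integrate(1.0, 1.0, 1.0, 10, 0.5) - exact)   # Crank-Nicolson
```

For the same ten steps the Crank-Nicolson error is markedly smaller than the backward Euler error, reflecting second-order versus first-order accuracy; backward Euler is the more robust choice, which is consistent with its use in the paper.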

3.2 Nonlinear solution process 

In this section, we describe the algorithm (see e.g. Geiger and Kanzow [7]) we
employ to solve the problems arising in (6), which have the general structure

\[ B(U;\Phi) = 0 . \tag{7} \]

1. Calculate correction by a linear problem

\[ L_B(D^j,\Phi) = B(U^j;\Phi) . \tag{8} \]

2. Perform a damped update

\[ U^{j+1} = U^j - \alpha^j D^j , \tag{9} \]

where $\alpha^j$ is chosen s.t.

(10)

with a constant $\delta > 0$.

3. Set $j = j+1$ and goto 1.

Remark: The Newton scheme is defined by the choice

\[ L_B(D^j,\Phi) = B'(U^j; D^j,\Phi) , \tag{11} \]

where $B'(U^j;\cdot,\cdot)$ is the derivative of $B(\cdot;\cdot)$ in $U^j$.



The problem $B(U;\Phi) = 0$ has in the cases under consideration the special form

\[ B(U)U - F(U) = 0 , \]

omitting the argument $\Phi$ to simplify notation. A full Newton scheme is
determined by

\begin{align*}
L_B(D) &= d_U\big(B(U)U - F(U)\big)D \\
&= B(U)D + B'(U)DU - F'(U)D .
\end{align*}

Neglecting the last two terms, our nonlinear iteration is given by

\begin{align*}
U^{j+1} &= U^j - \alpha^j B^{-1}(U^j)\big(B(U^j)U^j - F(U^j)\big) \\
&= (1-\alpha^j)U^j + \alpha^j B^{-1}(U^j)F(U^j) ,
\end{align*}

which allows for the interpretation as a damped variant of Kachanov's method
(freezing of coefficients).
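The damped frozen-coefficient iteration can be tried out on a scalar toy problem. The sketch below is a hedged illustration only: the scalar functions `B` and `F`, the fixed damping parameter and the stopping rule are assumptions chosen for the example, whereas the paper selects the damping adaptively and works with the full discrete system.

```python
def damped_kachanov(B, F, u0, alpha=0.5, tol=1e-12, max_iter=100):
    """Iterate u <- (1 - alpha)*u + alpha*F(u)/B(u) until successive iterates agree."""
    u = u0
    for j in range(max_iter):
        u_new = (1.0 - alpha) * u + alpha * F(u) / B(u)   # coefficient B frozen at u
        if abs(u_new - u) < tol:
            return u_new, j + 1
        u = u_new
    return u, max_iter

# Toy problem B(u)*u - F(u) = 0 with B(u) = 1 + u^2 and F = 2, i.e. u + u^3 = 2
B = lambda u: 1.0 + u * u
F = lambda u: 2.0
root, iterations = damped_kachanov(B, F, u0=0.3)
```

For this particular example the undamped map $u \mapsto 2/(1+u^2)$ has derivative $-1$ at the solution $u = 1$, so it is not contractive there, while the damped map with $\alpha = 1/2$ has derivative zero and converges rapidly.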

3.3 Spatial discretisation 

In order to obtain approximate solutions of the time-discrete problems (6), we
apply the finite element method on decompositions $\mathbb{T}_h = \{T_i \mid 1 \le i \le N_h\}$
of $\Omega$ consisting of quadrilateral elements $T_i$ satisfying the usual condition
of shape regularity. The width of the mesh $\mathbb{T}_h$ is characterised in terms of
a piecewise constant mesh size function $h = h(x) > 0$, where $h_T := h|_T = \operatorname{diam}(T)$.
With these notations the discrete solutions $U_h$ of (6) are defined by

\[ B(U_h;\Phi) = 0 \quad \forall \Phi \in \mathcal{V}_h , \tag{12} \]

where stable discretisations are established by a cell-wise constant pressure
approximation and discontinuous and continuous bilinear functions ($Q_1$) for
stress and velocity, respectively. Temperature is approximated by continuous
$Q_1$-elements as well, where the corresponding transport-dominated equation
requires further stabilisation, which is provided by the streamline diffusion
method. For a detailed discussion of this method we refer to the textbook
by C. Johnson [14].



3.4 Linear solution process 



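1) Assuming the temperature to be known, in each nonlinear iteration step we
have to solve a linear system of the form

\[
\begin{pmatrix} A & 0 & -B^T \\ 0 & 0 & C^T \\ -B & C & 0 \end{pmatrix}
\begin{pmatrix} \sigma \\ p \\ v \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ r \end{pmatrix}
\]

with the relations

\[ A \simeq (C(\sigma)\sigma,\tau), \quad B \simeq (\sigma,\varepsilon(\varphi)), \quad C \simeq (p,\operatorname{div}\varphi), \quad r \simeq -(f,\varphi) . \]

Identifying $U = (\sigma,p)^T$, $V = v$, $b = 0$, $c = r$ and

\[ \mathcal{A} = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}, \qquad \mathcal{B} = \begin{pmatrix} -B & C \end{pmatrix}, \]

we have to treat a general discrete saddle point problem of the form

\[ \begin{pmatrix} \mathcal{A} & \mathcal{B}^T \\ \mathcal{B} & 0 \end{pmatrix} \begin{pmatrix} U \\ V \end{pmatrix} = \begin{pmatrix} b \\ c \end{pmatrix} , \]

which can be done by employing augmented Lagrange algorithms (see e.g.
Glowinski [8]).

2) Assuming the velocity field to be known, in each nonlinear iteration step
we have to solve an unsymmetric linear system determining the temperature.
This can be done by using a standard BiCGStab algorithm.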
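For readers unfamiliar with BiCGStab, the following is a compact, unpreconditioned textbook version in pure Python for a small dense system. It is a sketch under stated assumptions: the test matrix is a made-up nonsymmetric tridiagonal matrix of convection-diffusion type, not the temperature system of the paper, and a production code would use sparse storage, a preconditioner and breakdown checks.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def bicgstab(A, b, tol=1e-10, max_iter=200):
    """Unpreconditioned BiCGStab with zero initial guess and relative residual test."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                       # residual b - A x for x = 0
    r_hat = r[:]                   # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 <= tol * b_norm:      # converged after the half step
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = matvec(A, s)
        omega = dot(t, s) / dot(t, t)             # stabilising minimal-residual step
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        rho = rho_new
    return x

# Hypothetical nonsymmetric test matrix: diagonal 4, sub-diagonal -2, super-diagonal -1
n = 8
A = [[4.0 if i == j else -2.0 if j == i - 1 else -1.0 if j == i + 1 else 0.0
      for j in range(n)] for i in range(n)]
b = [1.0 + 0.1 * i for i in range(n)]
x = bicgstab(A, b)
residual = max(abs(bi - yi) for bi, yi in zip(b, matvec(A, x)))
```

The method needs only products with the matrix itself (no transpose), which is one reason it is a common choice for the unsymmetric systems arising from stabilised convection-diffusion discretisations.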



4 Enhanced Model 

An important by-product of the simulation of the dynamics of large ice masses
is the determination of the age of the ice as a function of the position where it
is located today. Here, the age of an ice particle is defined as the time that
has elapsed since it fell as a snowflake on the free surface. The problem of the ice
age-depth correlation has indeed become a central question in the reconstruction
of the past climate from many ice cores in Greenland and Antarctica. The
thermomechanically coupled ice-sheet models yield this information on the ice
motion through spatial and temporal integration. However, the resulting motion
depends on the underlying constitutive behaviour, which is usually assumed to obey
an isotropic, non-linearly viscous flow law, as described above.

More precisely, such laws should be based upon the microscopic structure 
of the material, which is for ice, as for metals, a polycrystalline aggregate 
consisting of hexagonal single crystal grains. The ice in glaciers is not isotropic; 
at closer examination it is seen to be built by a large number of differently 
oriented, strongly anisotropic ice crystals. As structural elements one may 
distinguish two building stones defining the crystal: first the basal plane along 
which the crystals may relatively easily slide, and second, the unit normal 
vector perpendicular to it, defining the so-called c-axis. 

The anisotropy of ice crystallites is given by these axes. The fine scale is given
by the crystallites' orientation defined on the sphere, whereas the large scale
is to be identified with the space of daily experience, say $\mathbb{R}^d$. Following Gödert
[10], the fine-scale structure is actually considered via the second-order structure
tensor $A$ denoting mean orientation densities and yielding an anisotropic
material behaviour.

Therefore our classical system is enhanced by

\[ \dot{A} - \mathcal{E}[v,A] = 0 . \tag{13} \]

From now on, $C$ depends on the orientation

\[ A = (a_{11}, a_{22}, a_{12})^T \]

of the crystallites the ice consists of. The corresponding evolution of $A$ is
determined by $\mathcal{E}[v,A]$. Following Gödert [10], $\mathcal{E}$ is given by

\begin{align*}
\mathcal{E}[v,A] = {} & \big((\alpha_1 - 1)I_{\mathrm{dev}} - 2\alpha_1\mathcal{P}_a\big)\varepsilon(v) \\
& + \lambda_0\big(I_d - (d+1)A\big) + WA - AW ,
\end{align*}

where $W = (\nabla v - (\nabla v)^T)/2$. The value $\alpha_1$ defined by

(14)

denotes a measure of alignment: $\alpha_1 \approx 0$ for randomly distributed c-axis
orientations, whereas $\alpha_1 \approx \alpha$ if all c-axes are parallel. Furthermore, $\alpha = 1.2$ is
determined via field data (cf. Gödert [9]) and $I_2$ denotes the inner product
$I_2 := (I_d, A^2)$. The macro-space dimension is given by $d$, and $\lambda_0$ controls the
diffusion due to recrystallisation, e.g. Gödert [10].

The operator $\big((\alpha_1 - 1)I_{\mathrm{dev}} - 2\alpha_1\mathcal{P}_a\big)$ is realised as a $3\times 3$-matrix, applied
to the vector

\[ \varepsilon(v) = \big(\partial_x v_1,\ \partial_y v_2,\ (\partial_y v_1 + \partial_x v_2)/\sqrt{2}\big)^T . \]

$\mathcal{P}_a$ is symmetric and in this note given by

\[ \mathcal{P}_a = \begin{pmatrix} a_{11}a_{22} & -a_{11}a_{22} & c \\ -a_{11}a_{22} & a_{11}a_{22} & -c \\ c & -c & 1/2 - 2a_{12} \end{pmatrix} \]

with $c = a_{12}(2a_{22} - 1)$. The material behaviour is characterised by

\[ \ldots\ \beta I_{\mathrm{dev}} + (1-\beta)\mathcal{P}_a\ \ldots \tag{15} \]

where, in accordance with Gagliardini and Meyssonnier [5], $\beta = 0.25$ can be
interpreted as the ratio of the prismatic and the basal fluidity. Furthermore, $\mu$ is
given by

with a constant $\mu_0 > 0$. In the test calculations below we simply choose $\mu_0 = 1$.
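For the term $WA - AW$, one obtains, with $\omega := W_{12} = -W_{21}$,

\[ WA - AW = \omega \begin{pmatrix} 2a_{12} & a_{22} - a_{11} \\ a_{22} - a_{11} & -2a_{12} \end{pmatrix} . \]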
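In two dimensions the spin term $WA - AW$ is easy to evaluate directly from the definition $W = (\nabla v - (\nabla v)^T)/2$. The sketch below does this for a made-up velocity gradient and structure tensor; the index convention `grad_v[i][j]` $= \partial v_i/\partial x_j$ and all numerical values are assumptions for illustration.

```python
def matmul2(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spin_term(grad_v, A):
    """Return WA - AW for W = (grad_v - grad_v^T)/2 and a symmetric 2x2 tensor A."""
    W = [[0.5 * (grad_v[i][j] - grad_v[j][i]) for j in range(2)] for i in range(2)]
    WA = matmul2(W, A)
    AW = matmul2(A, W)
    return [[WA[i][j] - AW[i][j] for j in range(2)] for i in range(2)]

grad_v = [[0.1, 0.7], [-0.3, -0.1]]   # hypothetical velocity gradient
A = [[0.6, 0.2], [0.2, 0.4]]          # hypothetical structure tensor with trace 1
C = spin_term(grad_v, A)
```

The result is symmetric and trace-free, so the spin term rotates the fabric without changing the trace of $A$; with $W_{12} = 0.5$ here, the entries are $2 W_{12} a_{12}$ on the diagonal (with opposite signs) and $W_{12}(a_{22} - a_{11})$ off the diagonal.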

Remark: The evolution for $A$ has a hyperbolic character. Consequently,
we can prescribe inflow boundary data for $A$ on $\Gamma_-$. On $\Gamma_+$, $A$ is left free.
The discretisation for $A$ is performed using the same strategies as described
for the numerical treatment of the temperature $T$.



5 Numerical results 

The numerical results presented throughout this work are obtained by FE- 
implementations based on the DEAL-library [2]. 

5.1 Benchmark problem 

Here, we consider our model on a bounded domain $\Omega$ described by

\[ 1 - \ldots = 0, \qquad \varepsilon = 1.83\cdot 10^{-2} . \]

The structure of the FE-meshes is shown in Figure 1. In particular, we focus on
the evaluation of the velocity field along axes parallel to the $y$-axis.

In order to check our discretisation, we compare the computed solution
of $v_1$ along the vertical line $x = 1$ for the isotropic case to Vialov's profile
(cf. Vialov [20]), an analytical description for such a situation. The result is
depicted in Figure 2, demonstrating good agreement between the computed and
the exact solution, denoted by $u$ and $f(x)$ respectively.

Fig. 2. Evaluation of u along vertical line



5.2 Enhanced model 

The computations are performed using about 23365 degrees of freedom. The
time step is chosen adaptively via

with $\delta = 0.001$. Here $U_1$ and $U_2$ denote the solution at time step $m - 1$
obtained by employing the Euler and the Crank-Nicolson scheme, respectively. The
typical development of the local step size is depicted in Figure 3, showing
$k_m$ to increase in the stationary limit. The considerations are restricted to
isothermal flow.

Now, we compare the horizontal velocity along the vertical line $x = 1$
for the isotropic ($\beta = 1$) and the enhanced model ($\beta = 0.25$). Figure 4
shows that the enhanced model flows faster once the stationary
limit has been reached. This is in agreement with results found in Gagliardini and
Meyssonnier [6].

Finally, we investigate the influence of $\lambda_0$, which controls the effect of
recrystallisation, on the orientation of the c-axis described by the parameter
$\alpha_1$ in (14). The results are shown in Figure 5. As predicted from theory, the
maximum value of $\alpha_1$ is controlled by $\lambda_0$, and we observe $\alpha_1 \to \alpha$ as $\lambda_0$ tends
to zero.
Fig. 3. Typical development of the local step size during the simulation, showing
$k_m$ to increase in the stationary limit

Fig. 4. Evaluation of u along vertical line, demonstrating the enhanced model
($\beta = 0.25$) to become faster compared to the isotropic model ($\beta = 1$)

Fig. 5. Evaluation of $\alpha_1$ along vertical line for different values of the parameter $\lambda_0$



6 Conclusion 

We have presented a stable FE-discretisation for treating systems of partial
differential equations arising in glaciology. The system is a coupled one,
consisting of a flow problem determining stress, pressure and velocity and
evolution problems for temperature and mean orientation densities, describing
anisotropic material behaviour. The time discretisation is done by the stable
backward Euler scheme. Stable discretisations with respect to space are established
by a cell-wise constant pressure approximation and discontinuous and
continuous $Q_1$-elements for stress and velocity, respectively. Temperature and
orientation densities are approximated by continuous bilinear functions as well,
where the corresponding transport-dominated equations require further stabilisation,
which is provided by the streamline diffusion method. The proposed
strategies are applied to a standard model for describing ice sheet dynamics
and an enhanced one, taking into account the development of certain fabrics
in the structure of the ice.



References 

1. R. Calov, A. A. Savvin, R. Greve, I. Hansen, and K. Hutter. Simulation of the
Antarctic ice sheet with a three-dimensional polythermal ice sheet model. Annals
of Glaciology, 27:201-206, 1998.

2. DEAL. Differential equations analysis library. Available via
http://www-lsx.mathematik.uni-dortmund.de/user/lsx/suttmeier/deal.html, 1995.





390 G. Godert, F.T. Suttmeier 



3. A. Fabre, A. Letréguilly, and C. Ritz. Sensitivity of a Greenland ice sheet model
to ice flow and ablation parameters: Consequences on the evolution through the
last climatic cycle. Climate Dynamics, 13:11-24, 1997.

4. A.C. Fowler. Modelling ice sheet dynamics. Geophys. Astrophys. Fluid Dynamics,
63:29-65, 1992.

5. O. Gagliardini and J. Meyssonnier. Plane flow of an ice sheet exhibiting
strain-induced anisotropy. In K. Hutter, Y. Wang, and H. Beer, editors, Advances in
Cold-Region Thermal Engineering and Sciences, pages 171-182. Springer, 1999.

6. O. Gagliardini and J. Meyssonnier. About the condition to apply on the lateral
boundary of a model for local flow of anisotropic ice. To appear, 2001.

7. C. Geiger and C. Kanzow. Numerische Verfahren zur Lösung unrestringierter
Optimierungsaufgaben. Springer, 1999.

8. R. Glowinski. Numerical methods for nonlinear variational problems. Springer
Series in Comp. Physics. Springer, 1983.

9. G. Gödert. Meso-macro model for the description of induced anisotropy of natural
ice, including grain interaction. In K. Hutter, Y. Wang, and H. Beer, editors,
Advances in Cold-Region Thermal Engineering and Sciences, pages 183-196, 1999.

10. G. Gödert. The use of structure tensors to model the evolution of textural
anisotropy of polar ice. Ann. Glaciol., 2002. Submitted.

11. R. Greve, M. Weis, and K. Hutter. Palaeoclimatic and present conditions of
the Greenland ice sheet in the vicinity of Summit: An approach by large-scale
modelling. Paleoclimates, 2:133-161, 1998.

12. K. Hutter, S. Yakowitz, and F. Szidarovszky. A numerical study of plane ice sheet
flow. J. Glaciol., 32:139-160, 1986.

13. P. Huybrechts. The present evolution of the Greenland ice sheet: an assessment
by modelling. Global Planet. Change, 9:39-51, 1995.

14. C. Johnson. Numerical solution of partial differential equations by the finite
element method. Studentlitteratur, 1987.

15. L.W. Morland and I.R. Johnson. Steady motion of ice sheets. J. Glaciol., 28:229-246,
1980.

16. G.K.A. Oswald and G. de Q. Robin. Lakes beneath the Antarctic ice sheet. Nature,
245:251-254, 1973.

17. W.S.B. Paterson. The physics of glaciers. Pergamon, Oxford, 1981.

18. S. Richardson. On the no-slip boundary condition. J. Fluid Mech., 59:707-719,
1973.

19. T. Thorsteinsson. Textures and fabrics in the GRIP ice core, in relation to climate
history and ice deformation. Technical Report 206, Reports on Polar Research,
1996. ISSN 0176-5027.

20. S.S. Vialov. Regularities of glacial ice shields movements and the theory of
plastic viscous flow. Physics of the movements of ice, IAHS, 47:266-275, 1958.




Nonreflecting Boundary Conditions for 
Multiple Domain Wave Scattering in 
Unbounded Media 



Marcus J. Grote$^1$, Christoph Kirsch$^1$ and Patrick Meury$^2$

$^1$ Department of Mathematics, University of Basel,
Rheinsprung 21, CH-4051 Basel, Switzerland.

$^2$ Seminar for Applied Mathematics, ETH Zurich, Switzerland



Summary. A nonreflecting boundary condition is presented, which generalizes the 
well-known Dirichlet-to-Neumann (DtN) approach for time-harmonic scattering in 
unbounded domains to multiple scattering problems. Because this boundary condi- 
tion allows each scatterer to be enclosed by a separate artificial boundary, the size of 
the computational domain, and hence the computational cost, are greatly reduced. 



1 Introduction 

For the numerical solution of wave scattering problems in unbounded media,
a well-known approach is to enclose all obstacles, inhomogeneities and
nonlinearities with an artificial boundary $B$. A boundary condition is then imposed
on $B$, which leads to a numerically solvable boundary-value problem in a finite
computational domain $\Omega$. The boundary condition should be chosen such that
the solution of the problem in $\Omega$ coincides with the restriction to $\Omega$ of the
solution in the original unbounded region. Otherwise spurious reflections will
appear at $B$, which will travel back into the interior computational region and
spoil the numerical solution throughout $\Omega$.

Dirichlet-to-Neumann (DtN) maps yield exact nonreflecting boundary conditions
and thus avoid spurious reflections from $B$. They are explicitly known
for various equations or geometries [1, 2, 3, 4, 5]. Once combined with a finite
difference or finite element discretization inside $\Omega$, they lead to a highly
accurate and efficient numerical scheme.

Here we extend the DtN approach to multiple scattering problems, where every
scatterer is enclosed by a separate artificial boundary $B_j$. See [6] for an
introduction to multiple scattering. Hence $\Omega$ consists of multiple disjoint
components, $\Omega_j$. We derive an exact DtN boundary condition on $B$, the disjoint
union of all $B_j$, by combining multiple contributions from purely outgoing
wave fields. We present theoretical results that show existence and uniqueness
of the solution to the (artificial) boundary value problem, as well as numerical
results that demonstrate the accuracy and efficiency of our method.
2 Acoustic waves on two scatterers

We consider two disjoint scatterers in unbounded two-dimensional space. The
scatterers may contain obstacles, inhomogeneities and nonlinearity. Let $\Gamma$
denote the boundary of all obstacles and $\Omega_\infty$ the free space outside $\Gamma$. The
scattered field $u = u(r,\theta)$ is a solution of the exterior boundary value problem

\begin{align}
\Delta u + k^2 u &= f && \text{in } \Omega_\infty \subset \mathbb{R}^2, \tag{1} \\
u &= g && \text{on } \Gamma, \tag{2} \\
\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right) u &= 0. \tag{3}
\end{align}

The wave number $k$ and the source term $f$ can vary in space, while $f$ may be
nonlinear. We have chosen the Dirichlet-type condition (2) for simplicity. The
Sommerfeld radiation condition (3) ensures that the scattered field corresponds
to a purely outgoing wave at infinity.

We assume that both scatterers have compact support and that they are
well separated. In this case, they can be surrounded by two non-overlapping
disks centered at $c_1, c_2$ with radii $R_1, R_2$, respectively, such that in the
unbounded domain $D$ outside these disks, the scattered field satisfies the
homogeneous Helmholtz equation with a constant wave number $k > 0$, together
with the Sommerfeld radiation condition:

\begin{align}
\Delta u + k^2 u &= 0 \quad \text{in } D, \quad k > 0 \text{ constant}, \tag{4} \\
\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right) u &= 0. \tag{5}
\end{align}

Let $\Omega = \Omega_\infty \setminus D$ denote the finite domain inside the disks. $\Omega$ consists of
two disjoint components, $\Omega_1$ and $\Omega_2$. $\Omega$ is bounded by $\partial\Omega = \Gamma \cup B$, where
$\Gamma = \Gamma_1 \cup \Gamma_2$ and $B = \partial D$ consists of two circles $B_1$ and $B_2$. To solve the
scattering problem (1)-(3) in the finite domain $\Omega$, a boundary condition is needed
at the artificial boundary $B$, which ensures that the solution in $\Omega$, with that
boundary condition imposed on $B$, coincides with the restriction to $\Omega$ of the
solution in the original unbounded region $\Omega_\infty$.



2.1 Derivation of the DtN map 

Let $D_1$ denote the unbounded domain outside $B_1$ and $D_2$ the unbounded
domain outside $B_2$. We split the scattered field $u$ in the unbounded exterior
region $D$ into two purely outgoing wave fields $u_1$, $u_2$ which solve the following
problems:

\begin{align}
\Delta u_1 + k^2 u_1 &= 0 \quad \text{in } D_1, \tag{6} \\
\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right) u_1 &= 0, \tag{7} \\
\Delta u_2 + k^2 u_2 &= 0 \quad \text{in } D_2, \tag{8} \\
\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right) u_2 &= 0. \tag{9}
\end{align}

Either wave field is influenced by a single scatterer and therefore completely
oblivious to the other. The following proposition shows that solutions to (4)-(9)
are uniquely determined by the values of $u$ on $B$, $u_1$ on $B_1$, and $u_2$ on $B_2$,
respectively.

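Proposition 1. Let $K \subset \mathbb{R}^2$ be a compact set with smooth boundary. Then
the exterior Dirichlet problem

\begin{align}
\Delta u + k^2 u &= 0 \quad \text{in } \mathbb{R}^2 \setminus K, \quad k > 0 \text{ constant}, \tag{10} \\
u &= 0 \quad \text{on } \partial K, \tag{11} \\
\lim_{r\to\infty} \sqrt{r}\left(\frac{\partial}{\partial r} - ik\right) u &= 0 \tag{12}
\end{align}

has only the trivial solution.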

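Proof. Without loss of generality, we can assume $0 \in K$. Because $K$ is compact,
there exists an $R_0 > 0$ such that every circle $B_R$ centered at 0 with radius $R$
satisfies $B_R \subset \mathbb{R}^2 \setminus K$ for all $R > R_0$. Let $u$ now be a solution to (10)-(12).
A direct computation shows

\[ \int_{B_R}\left|\sqrt{r}\left(\frac{\partial}{\partial r}-ik\right)u\right|^2 ds = R\int_{B_R}\left(\left|\frac{\partial u}{\partial r}\right|^2 + k^2|u|^2\right)ds - ikR\int_{B_R}\left(u\frac{\partial\bar u}{\partial r} - \bar u\frac{\partial u}{\partial r}\right)ds . \tag{13} \]

From (12) we observe that the left side of (13) tends to zero as $R \to \infty$. Next,
we use Green's formula and (10) to conclude that

\[ \int_{B_R}\left(u\frac{\partial\bar u}{\partial r} - \bar u\frac{\partial u}{\partial r}\right)ds = \int_{\partial K}\left(u\frac{\partial\bar u}{\partial n} - \bar u\frac{\partial u}{\partial n}\right)ds . \tag{14} \]

Here $\partial/\partial n$ is the derivative in the direction of the normal vector on $\partial K$ pointing
away from $K$. The right hand side of (14) vanishes because of (11). Hence
(13) implies that

\[ \lim_{R\to\infty}\int_{B_R}|u|^2\,ds = 0 . \tag{15} \]

By Rellich's theorem (see for example [7], Lemma 2.11), we conclude

\[ u = 0 \quad \text{in } \mathbb{R}^2 \setminus K, \tag{16} \]

which completes the proof of Proposition 1. □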
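The "direct computation" in this proof rests on the pointwise identity $|u_r - iku|^2 = |u_r|^2 + k^2|u|^2 - ik\,(u\,\bar u_r - \bar u\,u_r)$ for complex numbers $u$, $u_r$ and real $k$. As a quick sanity check, the sketch below verifies it numerically on random samples; the sampling ranges and the seed are arbitrary choices.

```python
import random

def lhs(u, ur, k):
    """|u_r - i k u|^2"""
    return abs(ur - 1j * k * u) ** 2

def rhs(u, ur, k):
    """|u_r|^2 + k^2 |u|^2 - i k (u conj(u_r) - conj(u) u_r)"""
    cross = u * ur.conjugate() - u.conjugate() * ur   # purely imaginary
    return abs(ur) ** 2 + k ** 2 * abs(u) ** 2 - 1j * k * cross

random.seed(0)
max_diff = 0.0
for _ in range(1000):
    u = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    ur = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    k = random.uniform(0.1, 10.0)
    max_diff = max(max_diff, abs(lhs(u, ur, k) - rhs(u, ur, k)))
```

Since the cross term is purely imaginary, multiplying it by $-ik$ gives a real quantity, so both sides are real up to rounding and `max_diff` stays at the level of floating-point error.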
The solutions $u_1$ and $u_2$ to (6)-(9) can be explicitly written, in local polar
coordinates $(r_j,\theta_j)$, as a Fourier series

\[ u_j(r_j,\theta_j) = \frac{1}{\pi}\sum_{n=0}^{\infty}{}' \frac{H_n^{(1)}(kr_j)}{H_n^{(1)}(kR_j)} \int_0^{2\pi} u_j(R_j,\theta')\cos n(\theta_j-\theta')\,d\theta', \quad r_j \ge R_j, \tag{17} \]

for $j = 1,2$. The prime after the sum indicates that the term for $n = 0$ is
multiplied by 1/2, while $H_n^{(1)}$ denotes the $n$-th order Hankel function of the
first kind. Now let $u \in C^2(\Omega_\infty)$ be a given function which satisfies (4), (5).
Let $u_1$ and $u_2$ be the solutions to (6), (7) and (8), (9), respectively, together
with the following matching condition on $B$:

\[ u_1 + u_2 = u \quad \text{on } B = B_1 \cup B_2 . \tag{18} \]

Both $u$ and $u_1 + u_2$ satisfy (4), (5) in $D = D_1 \cap D_2$. Since $u$ and $u_1 + u_2$ coincide
on $B$, they coincide everywhere in the exterior region $D$. We summarize this
result in the following proposition.

Proposition 2. Let $u \in C^2(\Omega_\infty)$ be a function which satisfies (4), (5). Then

\[ u = u_1 + u_2 \quad \text{in } B \cup D, \tag{19} \]

where $u_1$ and $u_2$ are solutions to the problems (6), (7) and (8), (9), respectively,
together with the matching condition (18). The decomposition of $u$ into the two
purely outgoing wave fields $u_1$ and $u_2$ is unique.

Proof. Uniqueness: The uniqueness of the decomposition follows from Proposition 1
and from the linear independence of Hankel and Bessel functions; see
[8] for details.

Existence: We define the functions

\[ v_1 := u|_{B_1} \in C^0(B_1), \qquad v_2 := u|_{B_2} \in C^0(B_2). \tag{20} \]

Then we introduce the propagation operators

\begin{align}
P_1 &: C^0(B_1) \to C^0(B_2), & u_1|_{B_1} &\mapsto u_1|_{B_2}, \tag{21} \\
P_2 &: C^0(B_2) \to C^0(B_1), & u_2|_{B_2} &\mapsto u_2|_{B_1}. \tag{22}
\end{align}

Explicit formulas for $P_j$, $j = 1,2$, in local polar coordinates are given by
(17), with some coordinate transformations between $(r_1,\theta_1)$ and $(r_2,\theta_2)$. The
matching condition (18) can be written as an operator equation

\[ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + K\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \tag{23} \]

on the Banach space $X = C^0(B_1) \times C^0(B_2)$, where the operator $K : X \to X$
is given by

\[ K\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} := \begin{pmatrix} P_2 u_2 \\ P_1 u_1 \end{pmatrix}. \tag{24} \]

Note that the operator equation (23) with vanishing right-hand side admits
only the trivial solution, by the uniqueness proof above. Therefore, if $K$ is a
compact operator, existence and uniqueness of a solution to (23) for an arbitrary
right hand side follows from Fredholm's alternative. When the solution
to (23) is found, it can be extended by (17) to a solution $u_1$ of (6), (7) and a
solution $u_2$ of (8), (9), respectively. In their common domain $D = D_1 \cap D_2$, the
superposition $u_1 + u_2$ then satisfies (4), (5), and (18) on $B$. By Proposition 1,
$u = u_1 + u_2$ in $D$.
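The structure of the matching system, $u_1 + P_2 u_2 = v_1$ and $P_1 u_1 + u_2 = v_2$, can be explored with a finite-dimensional stand-in. In the sketch below the propagation operators are replaced by small made-up matrices with norms well below one, and the system is solved by a block Gauss-Seidel iteration; both the matrices and the choice of iteration are assumptions for illustration, whereas the existence argument in the text relies on Fredholm's alternative and makes no smallness assumption.

```python
def matvec(P, x):
    """Dense matrix-vector product."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]

def solve_matching(P1, P2, v1, v2, tol=1e-14, max_iter=200):
    """Block Gauss-Seidel for u1 + P2 u2 = v1, P1 u1 + u2 = v2."""
    u1, u2 = v1[:], v2[:]
    for _ in range(max_iter):
        u1_new = [a - b for a, b in zip(v1, matvec(P2, u2))]
        u2_new = [a - b for a, b in zip(v2, matvec(P1, u1_new))]
        change = max(abs(a - b) for a, b in zip(u1_new + u2_new, u1 + u2))
        u1, u2 = u1_new, u2_new
        if change < tol:
            break
    return u1, u2

# Hypothetical discrete propagation operators (row sums at most 0.4)
P1 = [[0.3, 0.1], [0.0, 0.2]]
P2 = [[0.2, 0.0], [0.1, 0.3]]
v1, v2 = [1.0, -0.5], [0.25, 2.0]
u1, u2 = solve_matching(P1, P2, v1, v2)
res1 = max(abs(a + b - c) for a, b, c in zip(u1, matvec(P2, u2), v1))
res2 = max(abs(a + b - c) for a, b, c in zip(matvec(P1, u1), u2, v2))
```

In a discretised scattering problem the coupling weakens as the scatterers move apart, which is the regime in which such a simple alternating iteration would be expected to work; for closely spaced scatterers a direct solve of the coupled system is the safer choice.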

It remains to show that the operator $K$ is compact. The propagation
operator $P_1$ is defined by

\[ P_1[u](\theta_2) = \frac{1}{\pi}\sum_{n=0}^{\infty}{}' \frac{H_n^{(1)}(kr_1)}{H_n^{(1)}(kR_1)} \int_0^{2\pi} u(\theta')\cos n(\theta_1 - \theta')\,d\theta' . \tag{25} \]

In (25), $r_1 = r_1(\theta_2)$ and $\theta_1 = \theta_1(\theta_2)$ denote the polar coordinates of the points
on $B_2$ relative to the center of $B_1$. The truncated version of $P_1$, denoted by
$P_1^N$, $N \in \mathbb{N}$, is defined as in (25), with the infinite sum truncated at $N$.

Lemma 1. The propagation operators have the following properties:

1. $P_1^N : C^0(B_1) \to C^0(B_2)$ is a bounded linear operator of finite rank

2. $P_1 : C^0(B_1) \to C^0(B_2)$ is a bounded linear operator

3. $\|P_1^N - P_1\| \to 0$, $N \to \infty$

From Lemma 1, compactness of $P_1$ follows (see for example [9], Corollary
II.3.3).

Proof (Lemma 1)

1. This property is obvious from the definition of $P_1^N$.

2. The linearity of $P_1$ is obvious from the definition. We shall now show the
boundedness of $P_1$.

Because the scatterers are well separated, there exists an $r_{\min} > R_1$ such
that $r_1(\theta_2) \ge r_{\min}$ for all $\theta_2 \in [0,2\pi]$. Because the function $x|H_n^{(1)}(x)|^2$ is
monotone decreasing for $x > 0$ and $n > 1/2$ [10], we therefore have

\[ kr_1|H_n^{(1)}(kr_1)|^2 \le kr_{\min}|H_n^{(1)}(kr_{\min})|^2, \quad n \ge 1, \tag{26} \]

from which we derive

\[ \left|\frac{H_n^{(1)}(kr_1)}{H_n^{(1)}(kR_1)}\right| \le \left|\frac{H_n^{(1)}(kr_{\min})}{H_n^{(1)}(kR_1)}\right| =: \alpha_n . \tag{27} \]

We will now show the convergence of the series $\sum_{n=1}^{\infty}\alpha_n$. From the
asymptotic behavior of the modulus of the Hankel functions for large orders [11],

(28)

for $n \to \infty$, we derive the following asymptotic behavior for the ratio $|\alpha_{n+1}/\alpha_n|$:

(29)

with

\[ \gamma_n := \frac{e k r_{\min}}{2n}, \qquad \Gamma_n := \frac{e k R_1}{2n} . \tag{30} \]

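Because $\gamma_n \to 0$, $n \to \infty$ and $\Gamma_n \to 0$, $n \to \infty$, we find that the expression
on the right hand side of (29) converges, and therefore

\[ \limsup_{n\to\infty}\left|\frac{\alpha_{n+1}}{\alpha_n}\right| = \lim_{n\to\infty}\left|\frac{\alpha_{n+1}}{\alpha_n}\right| = \frac{R_1}{r_{\min}} < 1 . \tag{31} \]

The ratio test now ensures convergence of the series $\sum_{n=1}^{\infty}\alpha_n$. We define

\[ \alpha_0 := \max_{0\le\theta_2\le 2\pi}\left|\frac{H_0^{(1)}(kr_1)}{H_0^{(1)}(kR_1)}\right| < \infty \]

and estimate

\begin{align}
\|P_1 u\|_\infty &= \max_{0\le\theta_2\le 2\pi}\left|\frac{1}{\pi}\sum_{n=0}^{\infty}{}' \frac{H_n^{(1)}(kr_1)}{H_n^{(1)}(kR_1)}\int_0^{2\pi} u(\theta')\cos n(\theta_1-\theta')\,d\theta'\right| \tag{32} \\
&\le \max_{0\le\theta_2\le 2\pi}\frac{1}{\pi}\sum_{n=0}^{\infty}\left|\frac{H_n^{(1)}(kr_1)}{H_n^{(1)}(kR_1)}\right|\int_0^{2\pi}|u(\theta')|\,d\theta' \tag{33} \\
&\le \max_{0\le\theta_2\le 2\pi} 2\sum_{n=0}^{\infty}\left|\frac{H_n^{(1)}(kr_1)}{H_n^{(1)}(kR_1)}\right|\,\|u\|_\infty \tag{34} \\
&\le 2\|u\|_\infty\sum_{n=0}^{\infty}\alpha_n . \tag{35}
\end{align}

We conclude $\|P_1\| \le 2\sum_{n=0}^{\infty}\alpha_n < \infty$.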

3. We use the definitions of $P_1$ and $P_1^N$ to estimate

\[ \|P_1^N - P_1\| \le 2\sum_{n=N+1}^{\infty}\alpha_n . \tag{36} \]

The right hand side tends to zero as $N \to \infty$. We conclude $\|P_1^N - P_1\| \to 0$,
$N \to \infty$.

This completes the proof of Lemma 1, which implies the compactness of the
operator $P_1$. Compactness of $P_2$ is shown similarly and compactness of $K$ then
follows. This completes the proof of Proposition 2. □
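The tail estimate for the truncated operator suggests a rough rule of thumb for choosing the number of Fourier modes. The sketch below assumes the purely geometric model $\alpha_n = q^n$ with $q = R_1/r_{\min}$, which mirrors only the limiting ratio of the coefficients and is not the exact sequence; under that assumption the tail $2\sum_{n>N}\alpha_n$ equals $2q^{N+1}/(1-q)$.

```python
def tail_bound(q, N):
    """Value of 2 * sum_{n > N} q^n for the geometric model alpha_n = q^n."""
    return 2.0 * q ** (N + 1) / (1.0 - q)

def modes_needed(R1, r_min, tol):
    """Smallest N with tail_bound(R1 / r_min, N) <= tol (model assumption: alpha_n = q^n)."""
    q = R1 / r_min
    if not 0.0 < q < 1.0:
        raise ValueError("need 0 < R1 < r_min")
    N = 0
    while tail_bound(q, N) > tol:
        N += 1
    return N

N = modes_needed(1.0, 2.0, 1e-6)   # hypothetical radii: R1 = 1, r_min = 2
```

The closer the second scatterer lies to the first artificial boundary, the closer $q$ is to one and the more modes are required, which matches the role of the separation assumption in the proof.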
As a consequence, for any given function $u \in C^2(\Omega_\infty)$ satisfying (4), (5), we
can determine an explicit relation between the values of $u|_B$ on the artificial
boundary (Dirichlet data) and the values of the normal derivative $(\partial_n u)|_B$
(Neumann data). This DtN map for $u$ is given by

\begin{align}
\partial_n u &= M[u_1] + T[u_2] && \text{on } B_1, \tag{37} \\
\partial_n u &= M[u_2] + T[u_1] && \text{on } B_2, \tag{38} \\
u_1 + P[u_2] &= u && \text{on } B_1, \tag{39} \\
P[u_1] + u_2 &= u && \text{on } B_2. \tag{40}
\end{align}

The operators $M$, $T$ and $P$ are defined by

\[ M[u_j](\theta_j) := \frac{1}{\pi}\sum_{n=0}^{\infty}{}' \frac{k\,H_n^{(1)\prime}(kR_j)}{H_n^{(1)}(kR_j)}\int_0^{2\pi} u_j(R_j,\theta')\cos n(\theta_j - \theta')\,d\theta', \tag{41} \]

$j = 1,2$ (standard single scatterer DtN operator),

\[ T[u_1](\theta_2) := \frac{\partial u_1}{\partial r_2}(R_2,\theta_2), \quad T[u_2](\theta_1) := \frac{\partial u_2}{\partial r_1}(R_1,\theta_1) \quad \text{(transfer operator)}, \tag{42} \]

\[ P[u_1](\theta_2) := u_1(R_2,\theta_2), \quad P[u_2](\theta_1) := u_2(R_1,\theta_1) \quad \text{(propagation operator)}. \tag{43} \]

The expressions on the right hand sides of (37), (38) and on the left hand
sides of (39), (40) are evaluated explicitly by using (41)-(43) and the exact
Fourier representation (17), which involves some straightforward but technical
coordinate transformations.

The matching condition (39), (40) cannot be inverted explicitly to eliminate
$u_1$ and $u_2$ from the DtN map (37)-(40). Instead, one also
needs to compute the values of $u_1$ on $B_1$ and $u_2$ on $B_2$.

With the DtN condition (37)-(40), we are now able to solve a boundary
value problem in the finite domain $\Omega$:

\begin{align}
\Delta u + k^2 u &= f && \text{in } \Omega, \tag{44} \\
u &= g && \text{on } \Gamma, \tag{45} \\
\partial_n u &= M[u_1] + T[u_2] && \text{on } B_1, \tag{46} \\
\partial_n u &= M[u_2] + T[u_1] && \text{on } B_2, \tag{47} \\
u_1 + P[u_2] &= u && \text{on } B_1, \tag{48} \\
P[u_1] + u_2 &= u && \text{on } B_2. \tag{49}
\end{align}

In [8] we prove the following theorem, which ensures existence and uniqueness
of a solution to the DtN problem (44)-(49).

Theorem 1. Assume that the free space problem (1)-(3) has a unique classical
solution $u \in C^2(\Omega_\infty)$ which satisfies (4), (5). Then the double scattering
boundary value problem (44)-(49) has a unique solution in $\Omega$, which coincides
with the restriction of $u$ to $\Omega$.
2.2 Combination with a numerical scheme

The boundary value problem (44)-(49) can be discretized by any numerical
scheme suited for the solution of elliptic boundary value problems, for example
by a finite difference or finite element scheme. Equations (46), (47) form a
Robin-type boundary condition, combined with (48), (49). The discretization of
(44)-(49) will lead to a large system of linear equations for the unknowns, which
are the values of the solution on the grid points, for example. The matching
condition (48), (49) requires the storage of additional values, namely the values
of the purely outgoing wave fields $u_j$ on the boundary components $B_j$, $j = 1,2$.
These auxiliary values are useful during post-processing for the evaluation of
$u$ outside the computational domain and in the far-field [8]. Details on the
finite difference and finite element implementation of the multi-DtN condition
(46)-(49) can be found in [8].



3 Numerical example 

We consider scattering of an incident plane wave impinging on three sound-soft
obstacles with kidney-shaped boundaries. The wave number is $k = 8\pi$. For the
numerical solution with a second-order finite difference scheme we generalize
the DtN condition presented above from two to three scatterers. This
generalization is straightforward and explicitly described in [8]. The contour lines
of the real part of the total field are shown in Fig. 1. In [8] we compared our
multi-DtN condition with the standard DtN condition applied to one single
very large domain and showed that the two solutions coincide.



4 Conclusion 

We have generalized the Dirichlet-to-Neumann boundary conditions for single
scattering to multiple scattering problems. To do so we have used an expansion
of the scattered field into multiple purely outgoing wave fields. We have
shown that the multi-DtN condition is exact, which implies that no spurious
reflections appear at the artificial boundary. When used in a numerical scheme,
the multi-DtN condition allows for much smaller computational domains than
the single-DtN condition, especially when the scatterers are far away from
each other. Moreover, the amount of work does not increase with increasing
distance between the obstacles. We have illustrated the use of the multi-DtN
condition via a numerical example.
Fig. 1. Contour lines of the real part of the total field for plane wave scattering on
three kidney-shaped obstacles with sound-soft boundaries. The plane wave is incident
from the right and the wave number is $k = 8\pi$. The multi-DtN condition is used at
the artificial boundary components, combined with a second-order finite difference
scheme in the interior.
Summary. We consider ill-posed problems $Au = f$ with operator $A \in L(H,H)$,
$A = A^* \ge 0$, where $H$ is a Hilbert space and the range $R(A)$ is non-closed. Regularized
solutions $u_r$ are obtained by a general regularization scheme, including the Lavrentiev
method, iteration methods and others. We assume that instead of $f \in R(A)$ noisy
data $\tilde f$ are available with an approximately given noise level $\delta$: it holds
$\|\tilde f - f\|/\delta \le \text{const}$ for $\delta \to 0$. We propose a new a posteriori rule for the
choice of the regularization parameter $r = r(\delta)$ guaranteeing $u_{r(\delta)} \to u_*$ for
$\delta \to 0$, where $u_*$ is the solution of the problem $Au = f$. Error estimates are given.



1 Introduction 

We consider an operator equation 

$$Au = f, \qquad f \in R(A), \tag{1}$$

where $A \in L(H,H)$, $A = A^* \ge 0$ is a linear continuous self-adjoint and
non-negative operator; $u$ and $f$ are elements of the real Hilbert space $H$. We
do not suppose that the range $R(A)$ is closed, so in general our problem
is ill-posed. The kernel $N(A)$ may be non-trivial. As usual in the treatment of
ill-posed problems, we suppose that instead of the exact right-hand side $f$ we
have only an approximation $\tilde f \in H$.

The approximate solution $u_r$ of the ill-posed problem $Au = f$ is found by
some regularization method and depends on the regularization parameter $r$. If
the noise level $\delta$ with $\|\tilde f - f\| \le \delta$ is known, then a proper parameter choice
$r = r(\delta)$ guarantees $u_{r(\delta)} \to u_*$ for $\delta \to 0$, where $u_*$ is the solution of
$Au = f$ nearest to the initial approximation $u_0$ (see Section 2; often $u_0 = 0$).

In this paper we are interested in the case of an approximately given noise
level $\delta$: it is unknown whether the inequality $\|\tilde f - f\| \le \delta$ holds or not. Instead
of this inequality we assume that $\|\tilde f - f\|/\delta \le \text{const}$ for $\delta \to 0$ and we give
a new rule for the parameter choice $r = r(\delta)$ guaranteeing $u_{r(\delta)} \to u_*$ for
$\delta \to 0$. Error estimates are presented as well.

* This work was supported by the Estonian Science Foundation, Grant No. 5785. 
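2 Regularization methods

We consider the regularization methods in the general form (see [1, 2])

$$u_r = (I - A g_r(A))u_0 + g_r(A)\tilde f, \tag{2}$$

where $u_r$ is the approximate solution, $u_0$ the initial approximation, $r$ the
regularization parameter, $I$ the identity operator, and the function $g_r(\lambda)$
satisfies the conditions (3) and (4):

$$\sup_{0 \le \lambda \le a} |g_r(\lambda)| \le \gamma r, \qquad r > 0, \tag{3}$$

$$\sup_{0 \le \lambda \le a} \lambda^p\,|1 - \lambda g_r(\lambda)| \le \gamma_p r^{-p}, \qquad r > 0,\ 0 \le p \le p_0. \tag{4}$$

Here $p_0$, $\gamma$ and $\gamma_p$ are positive constants, $a \ge \|A\|$, $\gamma_0 \le 1$, and the greatest
value of $p_0$ for which the inequality (4) holds is called the qualification of
the method.

The following regularization methods are special cases of the general
method (2).

M1 The Lavrentiev method $u_\alpha = (\alpha I + A)^{-1}\tilde f$. Here $u_0 = 0$, $r = \alpha^{-1}$,
$g_r(\lambda) = (\lambda + r^{-1})^{-1}$, $p_0 = 1$.

M2 The iterative variant of the Lavrentiev method. Let $m \in \mathbb{N}$, $m \ge 1$, let
$u_{0,\alpha} = u_0 \in H$ be the initial approximation and
$u_{n,\alpha} = (\alpha I + A)^{-1}(\alpha u_{n-1,\alpha} + \tilde f)$, $n = 1, \ldots, m$.
Here $r = \alpha^{-1}$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - (1 + r\lambda)^{-m}\right)$, $p_0 = m$.

M3 Explicit iteration scheme (Landweber's method). Let $0 < \mu < 1/\|A\|$
be a constant and

$$u_n = u_{n-1} - \mu(Au_{n-1} - \tilde f), \qquad n = 1, 2, \ldots.$$

Here $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - (1 - \mu\lambda)^r\right)$, $p_0 = \infty$.

M4 Implicit iteration scheme. Let $\alpha > 0$ be a constant and

$$\alpha u_n + Au_n = \alpha u_{n-1} + \tilde f, \qquad n = 1, 2, \ldots.$$

Here $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{\alpha}{\alpha + \lambda}\right)^r\right)$, $p_0 = \infty$.

M5 The method of the Cauchy problem. We take the solution of the Cauchy
problem

$$u'(r) + Au(r) = \tilde f, \qquad u(0) = u_0$$

as the approximation $u_r$ to the solution of problem (1). Here
$g_r(\lambda) = \frac{1}{\lambda}\left(1 - e^{-r\lambda}\right)$, $p_0 = \infty$.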
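
For a diagonal operator the filter functions $g_r(\lambda)$ of methods M1-M5 can be evaluated componentwise. The following Python sketch is an illustration only, not the authors' implementation: the function names, the list-based representation of a diagonal $A$ with strictly positive eigenvalues, and the sample values are assumptions made here.

```python
import math


def g_lavrentiev(lam, r, m=1):
    """g_r(lambda) for M1 (m = 1) and M2 (m iterations), with r = 1/alpha."""
    return (1.0 - (1.0 + r * lam) ** (-m)) / lam


def g_landweber(lam, r, mu):
    """g_r(lambda) for M3 after r = n explicit iteration steps."""
    return (1.0 - (1.0 - mu * lam) ** r) / lam


def g_implicit(lam, r, alpha):
    """g_r(lambda) for M4 after r = n implicit iteration steps."""
    return (1.0 - (alpha / (alpha + lam)) ** r) / lam


def g_cauchy(lam, r):
    """g_r(lambda) for M5; expm1 avoids cancellation for small r * lam."""
    return -math.expm1(-r * lam) / lam


def regularized_solution(eigs, f_noisy, u0, g):
    """u_r = (I - A g_r(A)) u0 + g_r(A) f for diagonal A = diag(eigs), eigs > 0."""
    return [(1.0 - lam * g(lam)) * u0k + g(lam) * fk
            for lam, fk, u0k in zip(eigs, f_noisy, u0)]


if __name__ == "__main__":
    eigs = [1.0, 0.1, 0.01]
    f = [1.0, 0.1, 0.01]  # exact data for the solution u = (1, 1, 1)
    r = 100.0             # hypothetical parameter value, r = 1/alpha
    u_r = regularized_solution(eigs, f, [0.0] * 3,
                               lambda lam: g_lavrentiev(lam, r))
    print(u_r)
```

With $m = 1$ the first function reduces to $g_r(\lambda) = (\lambda + r^{-1})^{-1}$, the filter of M1.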
3 Parameter choice in case of the known noise level of data

In regularization methods (2) an important problem is the choice of a proper
regularization parameter $r$. If $r$ is too big, the numerical implementation will
be unstable and $u_r$ will be useless; if $r$ is too small, the approximation $u_r$ is
dominated by the initial guess $u_0$. The rules for the parameter choice can be
divided into two groups: the rules of the first group use the noise level, the
rules of the second group do not.

If the noise level $\delta$ with $\|\tilde f - f\| \le \delta$ is known, the most prominent rule for
methods M2-M5 is Morozov's discrepancy principle.

Morozov's discrepancy principle [3]. In this rule the regularization
parameter $r = r_D$ is chosen as the solution of the equation

$$\|Au_r - \tilde f\| = b\delta \quad \text{with } b = \text{const} > 1.$$

The second rule of the first group is the modification of the discrepancy
principle (MD rule) [4]. In this rule the regularization parameter $r = r_{MD}$
is chosen as the solution of the equation

$$\|B_r(Au_r - \tilde f)\| = b\delta \quad \text{with } b = \text{const} > 1,$$

where the operator

$$B_r = \begin{cases} I & \text{for } p_0 = \infty, \\ (I - Ag_r(A))^{1/p_0} & \text{for } p_0 < \infty \end{cases}$$

depends on the qualification $p_0$ of the method.
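
For the Lavrentiev method (M1, $u_0 = 0$) on a diagonal operator, the MD rule reduces to a scalar equation in $r$: there $Au_r - \tilde f = -(I + rA)^{-1}\tilde f$ and $B_r = (I + rA)^{-1}$, so $\|B_r(Au_r - \tilde f)\|$ is the norm of the vector with components $\tilde f_k/(1 + r\lambda_k)^2$, which decreases monotonically in $r$. A minimal sketch, assuming a diagonal $A$ with non-negative eigenvalues; the helper names, the bracketing interval and the bisection on a logarithmic scale are choices made here for illustration and are not part of the rule itself:

```python
import math


def md_discrepancy(eigs, f_noisy, r):
    """||B_r (A u_r - f)|| for the Lavrentiev method with u0 = 0, diagonal A."""
    return math.sqrt(sum((fk / (1.0 + r * lam) ** 2) ** 2
                         for lam, fk in zip(eigs, f_noisy)))


def md_rule(eigs, f_noisy, delta, b=1.5, r_lo=1e-8, r_hi=1e12, steps=200):
    """Approximate solution r_MD of ||B_r (A u_r - f)|| = b * delta."""
    target = b * delta
    if md_discrepancy(eigs, f_noisy, r_lo) <= target:
        return r_lo  # discrepancy already below b * delta at the left end
    if md_discrepancy(eigs, f_noisy, r_hi) > target:
        return r_hi  # no crossing inside the (assumed) bracket
    for _ in range(steps):
        r_mid = math.sqrt(r_lo * r_hi)  # bisection in log(r)
        if md_discrepancy(eigs, f_noisy, r_mid) > target:
            r_lo = r_mid
        else:
            r_hi = r_mid
    return r_hi
```

Because the discrepancy is monotone in $r$ for this method, a bracketing search of this kind is sufficient; for other methods the function $B_r$ has to be adapted.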

The discrepancy principle and its modification coincide for regularization
methods M3-M5, where $p_0 = \infty$, but these rules differ for the Lavrentiev
method and its iterative variant, where $\|B_r(Au_{m,\alpha} - \tilde f)\| = \|Au_{m+1,\alpha} - \tilde f\|$.
Some useful properties of the MD rule which are important for our further
study are:

1) convergence: $\|u_{r_{MD}} - u_*\| \to 0$ for $\delta \to 0$; here $u_*$ is the solution of
the problem $Au = f$ nearest to $u_0$;

2) order-optimality: if $u_0 - u_* = A^p v$, $v \in H$, $\|v\| \le \rho$, $p > 0$, then
$\|u_{r_{MD}} - u_*\| \le c_p\,\rho^{1/(p+1)}\delta^{p/(p+1)}$, $0 < p \le p_0$;

3) quasioptimality: there exists a constant $C$ such that

$$\|u_{r_{MD}} - u_*\| \le C \inf_{r \ge 0}\{\|(I - Ag_r(A))(u_0 - u_*)\| + \gamma r\delta\}. \tag{5}$$

The MD rule has some advantages over the discrepancy principle since
(i) the MD rule can also be used for the Lavrentiev method;
(ii) the method M2 with $r = r_D$ is order-optimal only for the range
$p \in (0, p_0 - 1]$, but not order-optimal for $p > p_0 - 1$;
(iii) it can be shown that the method M2 with $r = r_D$ is not quasi-optimal.

However, the discrepancy principle and its modification have an essential
disadvantage. Namely, these choices are unstable in the sense that if the actual
error of the right-hand side is larger than $b\delta$, then the error of the approximate
solution may be arbitrarily large, independently of the value of the ratio of the
actual and supposed noise levels. For example, if $b = 2$ and the actual noise
level is three times larger than the noise level $\delta$ used in the rule, then
the error of the approximate solution may be arbitrarily large.

There are also parameter choice rules which do not use the noise level
$\delta$. Such choices are sometimes called heuristic or $\delta$-free choices.
The first such parameter choice rule was the quasioptimality criterion [5, 6].
According to this rule, the parameter is chosen for which the function
$k(r) = r\|B_r(Au_r - \tilde f)\|$ attains its minimum. Other popular $\delta$-free rules are
Wahba's cross-validation rule [7, 8] and Hansen's L-curve criterion
[9, 10]. Some heuristic rules are also proposed in [11].

Although these rules often work well (or even better than the discrepancy 
principle and the MD rule), it was shown by Bakushinskii [12], that one cannot 
prove the convergence of the approximate solution for heuristic parameter 
choice rules. 



4 Parameter choice rule in the case of the approximately 
given noise level of data 

In applied ill-posed problems the exact noise level is often unknown. Therefore
in the following we assume that the actual noise level is unknown and only
some guesses about this level can be made. This means that a supposed error
level $\delta > 0$ is given, but we do not know whether $\|\tilde f - f\| \le \delta$ holds or not.
Our aim is to present a rule for a stable parameter choice which guarantees
the convergence of the approximate solution to the exact solution provided only
that the ratio $\|\tilde f - f\|/\delta$ is bounded in the process $\delta \to 0$, and to give some
error estimates for the approximate solution.

In the following the function

$$\varphi(r) = \sqrt{r}\,\|A^{1/2}B_r^{3/2}(Au_r - \tilde f)\| = \sqrt{r}\,\big(B_r(Au_r - \tilde f),\, AB_r^2(Au_r - \tilde f)\big)^{1/2}$$

plays an important role.

Note that for the Lavrentiev method and its iterative variant $B_r = (I + rA)^{-1}$
and $\varphi(r) = \varphi(\alpha^{-1}) = \alpha^{-1/2}\big(Au_{m+1,\alpha} - \tilde f,\, A(Au_{m+2,\alpha} - \tilde f)\big)^{1/2}$;
for iterative methods $\varphi(r) = \varphi(n) = \sqrt{n}\,\big(Au_n - \tilde f,\, A(Au_n - \tilde f)\big)^{1/2}$.

Rule P. Let $0 \le s \le 1$ and let $b_1$, $b_2$ be constants such that $b_2 \ge b_1 > C_M$,
where $C_M = 1/2$, $C_M = 1/\sqrt{2m+3}$, $C_M = 1/\sqrt{2\mu e}$, $C_M = \sqrt{\alpha}/2$ and
$C_M = 1/\sqrt{2e}$ for methods M1-M5 respectively. If $\varphi(1) \le b_2\delta$, then choose $r(\delta) = 1$.
Otherwise we first find $r_2(\delta) > 1$ such that

$$\varphi(r_2(\delta)) \le b_2\delta, \tag{6}$$

$$\varphi(r) \ge b_1\delta \quad \forall r \in [1, r_2(\delta)]. \tag{7}$$

For the regularization parameter $r(\delta)$ we choose the parameter $r$ for which
the function $t(r) = r^s\|B_r(Au_r - \tilde f)\|$ attains its global minimum on the interval
$[1, r_2(\delta)]$.

In iterative methods the following rule P' can be used for the choice of the
stopping index $n(\delta)$ as the parameter $r$. For rule P' results analogous to those
for rule P hold.

Rule P'. Let $0 \le s \le 1$ and let $b$ be a constant such that $b > C_M$. Find
$n_2(\delta)$ as the first $n = 1, 2, \ldots$ for which $\varphi(n) \le b\delta$. For the regularization
parameter $n(\delta)$ we choose $n \in \mathbb{N}$ for which the function $t(n) = n^s\|Au_n - \tilde f\|$
attains its global minimum on the interval $[1, n_2(\delta)]$.
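
The steps of Rule P can be sketched for the Lavrentiev method (M1, $u_0 = 0$) on a diagonal operator, where $\varphi(r)^2 = r\sum_k \lambda_k \tilde f_k^2/(1 + r\lambda_k)^5$ and $t(r) = r^s\big(\sum_k \tilde f_k^2/(1 + r\lambda_k)^4\big)^{1/2}$. The code below is a hedged illustration, not the authors' program: it assumes $b_1 = b_2 = b$, replaces the continuous search for $r_2(\delta)$ by a geometric grid with an arbitrarily chosen ratio `q`, and minimizes $t$ only over the grid points; all names are hypothetical.

```python
import math

C_M_LAVRENTIEV = 0.5  # constant C_M for method M1 as stated in Rule P


def phi(eigs, f_noisy, r):
    """phi(r) = sqrt(r) ||A^(1/2) B_r^(3/2) (A u_r - f)||, Lavrentiev, u0 = 0."""
    return math.sqrt(r * sum(lam * fk * fk / (1.0 + r * lam) ** 5
                             for lam, fk in zip(eigs, f_noisy)))


def t_func(eigs, f_noisy, r, s):
    """t(r) = r^s ||B_r (A u_r - f)|| for the Lavrentiev method, u0 = 0."""
    return r ** s * math.sqrt(sum((fk / (1.0 + r * lam) ** 2) ** 2
                                  for lam, fk in zip(eigs, f_noisy)))


def rule_p(eigs, f_noisy, delta, s=0.75, b=1.5 * C_M_LAVRENTIEV,
           q=1.05, r_max=1e15):
    """Grid version of Rule P with b1 = b2 = b; returns (r, r2)."""
    if phi(eigs, f_noisy, 1.0) <= b * delta:
        return 1.0, 1.0
    grid = [1.0]
    # March to the right until phi drops to b * delta for the first time.
    while phi(eigs, f_noisy, grid[-1]) > b * delta and grid[-1] < r_max:
        grid.append(grid[-1] * q)
    r2 = grid[-1]
    # Minimize t(r) over the grid points in [1, r2].
    r = min(grid, key=lambda x: t_func(eigs, f_noisy, x, s))
    return r, r2
```

The default values $s = 0.75$ and $b = 1.5\,C_M$ mirror the setting reported for Table 1; a finer grid (smaller `q`) approximates the continuous rule more closely at higher cost.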

In [13] a rule for the choice of the parameter was considered in which the
parameter $r_2(\delta)$ was taken as the regularization parameter. We can consider
rule P as a generalization of this rule, since in the case $s = 0$ these rules
coincide due to the fact that the function $\|B_r(Au_r - \tilde f)\|$ is monotonically
decreasing with respect to $r$. On the other hand, in the case $s = 1$ rule P
is similar to the non-self-adjoint analogue of the parameter choice rule by the
quasioptimality criterion, since we choose the minimum point of the function
$r\|B_r(Au_r - \tilde f)\|$ as the regularization parameter. The only difference between
rule P and the analogue of the quasioptimality criterion is the interval on
which the function $r\|B_r(Au_r - \tilde f)\|$ is minimized: the intervals are $[1, r_2(\delta)]$ and
$[1, \infty)$ respectively.

In [13] the following results are proven for methods M1-M5:

(i) for each $\tilde f \in H$ we have $\lim_{r \to \infty} \varphi(r) = 0$;

(ii) if $\|\tilde f - f\| \le \delta$, $\|u_0 - u_*\| \le M$, $b > C_M$, then for each $r \ge R_{M,\delta} =
c_M M/((b - C_M)\delta)$ we have $\varphi(r) \le b\delta$; here $c_M = 12\sqrt{15}/125$,
$c_M = (3/2)^{3/2} m^m/(m + 3/2)^{m+3/2}$, $c_M = (3/(2\mu e))^{3/2}$, $c_M = (3\alpha/2)^{3/2}$,
$c_M = (3/(2e))^{3/2}$ for methods M1-M5 respectively;

(iii) if $\|\tilde f - f\|/\delta \le \text{const}$ for $\delta \to 0$, then $\|u_{r_2(\delta)} - u_*\| \to 0$ for $\delta \to 0$.

Due to the continuity of the function $\varphi(r)$, property (i) implies
that the choice of finite parameters $r_2(\delta)$ and $r(\delta) \le r_2(\delta)$ according to Rule
P is possible. Property (ii) says that if we know a constant $M > 0$ such
that $\|u_0 - u_*\| \le M$, then it is sufficient to search for the parameter $r_2(\delta)$ in
the finite interval $[1, R_{M,\delta}]$. Note that the function $\varphi(r)$ is non-monotone and
therefore in Rule P we must use the conditions (6)-(7) instead of the inequalities
$b_1\delta \le \varphi(r) \le b_2\delta$.

Note that the analogues of the results of the paper [13] for non-self-adjoint
problems are presented in [14].
Theorem 1. Let $A \in L(H,H)$, $A = A^* \ge 0$, $f \in R(A)$. Let the parameter
$r(\delta)$ be chosen according to Rule P. If $\|\tilde f - f\|/\delta \le \text{const}$ in the process $\delta \to 0$,
then in methods M1-M5

$$\|u_{r(\delta)} - u_*\| \to 0 \quad \text{for } \delta \to 0.$$

Proof. Denote $G_r := I - Ag_r(A)$. Then we have

$$u_r - u_* = G_r(u_0 - u_*) + g_r(A)(\tilde f - f),$$

from which with (3) it follows that

$$\|u_{r(\delta)} - u_*\| \le \|G_{r(\delta)}(u_0 - u_*)\| + \gamma C r(\delta)\delta, \tag{8}$$

where $C$ denotes the constant bounding the ratio $\|\tilde f - f\|/\delta$.
To prove the theorem, it suffices to show that the right-hand side of (8)
converges to zero. In [13] it is proven that

$$r_2(\delta)\,\delta \to 0 \quad \text{for } \delta \to 0. \tag{9}$$

From the inequality $r(\delta) \le r_2(\delta)$ and from (9) follows the convergence of the
second term of (8). To show the convergence of the first term of (8), we first
prove that

$$r_2^s(\delta)\,\|B_{r_2(\delta)}(Au_{r_2(\delta)} - \tilde f)\| \to 0 \quad \text{for } \delta \to 0. \tag{10}$$

We have

$$B_r(Au_r - \tilde f) = AB_rG_r(u_0 - u_*) - B_rG_r(\tilde f - f), \tag{11}$$

from which, with regard to the inequality $\|B_rG_r(\tilde f - f)\| \le \|\tilde f - f\| \le C\delta$,
it follows that

$$r_2^s(\delta)\|B_{r_2(\delta)}(Au_{r_2(\delta)} - \tilde f)\| \le r_2^s(\delta)\|AB_{r_2(\delta)}G_{r_2(\delta)}(u_0 - u_*)\| + r_2^s(\delta)\,C\delta. \tag{12}$$

To show the convergence

$$r_2^s(\delta)\,\|AB_{r_2(\delta)}G_{r_2(\delta)}(u_0 - u_*)\| \to 0 \quad \text{for } \delta \to 0, \tag{13}$$

we consider the cases a) $r_2(\delta) \to \infty$, b) $r_2(\delta) \le \bar r = \text{const}$ separately. If
$r_2(\delta) \to \infty$ in the process $\delta \to 0$, then using the Banach-Steinhaus theorem we
can prove similarly as in [2] (p. 43) that $r^s\|AB_rG_r(u_0 - u_*)\| \to 0$ if $r \to \infty$
($0 \le s \le 1$). Now we consider the case $r_2(\delta) \le \bar r = \text{const}$. Using (11), (4) we
get

$$r_2(\delta)^{1/2}\|A^{3/2}B_{r_2(\delta)}^{3/2}G_{r_2(\delta)}(u_0 - u_*)\| \le r_2(\delta)^{1/2}\|A^{1/2}B_{r_2(\delta)}^{3/2}(Au_{r_2(\delta)} - \tilde f)\|$$

$$\qquad + r_2(\delta)^{1/2}\|A^{1/2}B_{r_2(\delta)}^{3/2}G_{r_2(\delta)}(\tilde f - f)\| \le b_2\delta + \gamma_{1/2}\|\tilde f - f\| \le (b_2 + C\gamma_{1/2})\delta,$$

from which it follows that

$$\|A^{3/2}B_{r_2(\delta)}^{3/2}G_{r_2(\delta)}(u_0 - u_*)\| \to 0 \quad \text{for } \delta \to 0.$$

In [2] (p. 66) the implication

$$AG_{r_n}(u_0 - u_*) \to 0 \ (n \to \infty) \ \Longrightarrow\ G_{r_n}(u_0 - u_*) \to 0 \ (n \to \infty) \tag{14}$$

is proven. Similarly we can show that if $A^{3/2}B_{r_n}^{3/2}G_{r_n}(u_0 - u_*) \to 0$ $(n \to \infty)$,
then $AB_{r_n}G_{r_n}(u_0 - u_*) \to 0$ $(n \to \infty)$, which proves the convergence (13) in
this case. Now the convergence (10) follows from (12), (13) and (9).

Recall that the parameter $r(\delta)$ is the global minimum point of the
function $t(r) = r^s\|B_r(Au_r - \tilde f)\|$ in $[1, r_2(\delta)]$. Therefore from (10) follows the
convergence

$$r^s(\delta)\,\|B_{r(\delta)}(Au_{r(\delta)} - \tilde f)\| \to 0 \quad \text{for } \delta \to 0.$$

Using (11) we get

$$r^s(\delta)\|AB_{r(\delta)}G_{r(\delta)}(u_0 - u_*)\| \le r^s(\delta)\|B_{r(\delta)}(Au_{r(\delta)} - \tilde f)\| + r^s(\delta)\,C\delta \to 0$$

if $\delta \to 0$.

From this relation with an implication of type (14) we get the convergence
$\|G_{r(\delta)}(u_0 - u_*)\| \to 0$ for $\delta \to 0$, which with (8) proves the theorem.

In the next two theorems we give estimates for the error of the approximate
solution in the cases $\|\tilde f - f\| \le \delta$ and $\|\tilde f - f\| > \delta$ respectively. The proofs of these
theorems will be presented in a forthcoming paper.

Theorem 2. Let $A \in L(H,H)$, $A = A^* \ge 0$, $f \in R(A)$, $\|\tilde f - f\| \le \delta$. Let
the parameter $r(\delta)$ be chosen according to Rule P with $s \in (0, 1)$ and let the
function $t(r) = r^s\|B_r(Au_r - \tilde f)\|$ be monotonically increasing on the interval
$[r(\delta), r_2(\delta)]$. Then for methods M1-M5 the error estimate

||wr( 5 ) inf {||(/-^5r(^))(wo -M«)|| +7^<^} (15) 

holds, where $b_* = \max_{r(\delta) \le r \le R(\delta)} \varphi(r)/\delta \ge b_2$ and $R(\delta)$ is the greatest
parameter for which $\varphi(r) = b_2\delta$.

Theorem 3. Let $A \in L(H,H)$, $A = A^* \ge 0$, $f \in R(A)$. Let the parameter
$r(\delta)$ be chosen according to Rule P with $s \in (0, 1)$. Then in the case $\|\tilde f - f\| > \delta$
for methods M1-M5 the following error estimates hold:

a) if $\delta < \|\tilde f - f\| \le \delta_0$, where $\delta_0 := \|B_{r(\delta)}(Au_{r(\delta)} - \tilde f)\|$, then

l|Mr-(5)-M*|| < Ci(6i,6*)j^ inf {||(J-Affr(A))(wo-w*)||+7^’||/-/||} ; 

(16) 

f>) if 11/ - /II > ^ 0 , then 

||Mr(5) - w*|| < (17) 

< C2{bi,b2)( ^T.J . }\ ^ ^>0 “ Agr{A)){uo - u*)|| + 7 »-||/ - /||} ■ 
5 Numerical experiments

Discretization of ill-posed problems often leads to linear systems of algebraic
equations with large condition numbers. In our numerical experiments
we solved many linear systems of equations $Au = f$, where $A$ was a $100 \times 100$
diagonal matrix with diagonal elements $\lambda_1, \ldots, \lambda_{100}$. For generating the
eigenvalues $\lambda_k$ of $A$, the solution $u = (u_1, \ldots, u_{100})^T$, the noise in $f$ and the
supposed noise level $\delta$, different schemes were used and the concrete scheme was
chosen randomly by computer. The schemes for the eigenvalues $\lambda_k$ of $A$ and for
the solution $u$ were

$\lambda_k = \mathrm{rand}(0,1)\,a^{-k}$, $a = 1.3;\ 2;\ 5;\ 10$;

$\lambda_k = \mathrm{rand}(0,1)\,k^{-a}$, $a = 1;\ 2;\ 3;\ 4;\ 10$;

A/e = P/e(rand (0, l))Xk-i , Ai = 1; Pk{x) = x; e~^\xj\Jk ; 

A: = 2, 3, . . . , 100 ; 

$u_k = \mathrm{rand}(0,1)\,k^{-a}$, $a = -1;\ -0.5;\ 0;\ 0.5;\ 1;\ 2;\ 4$;

Uk = i?/e(rand (0, l))uk-i , ui = 1; Rk{x) = x; x^ ; 

A; = 2, 3, ...,100; 

where rand (0, 1) is a random number in interval (0, 1). The exact right hand 
side / was found by formula / = Au and perturbations satisfied the relation 
11/ — /II = 10“-^/^ 11/ II, j = 1, 2, . . . , 13 (Euclidean norms). Concrete random 
perturbations were distributed uniformly by the formula 

fk = /fc + 2(rand (0,1) -0.5)11/ -/II/ i (2(rand (0, 1) - 0.5))2 , 

L i=l -1 

A: = 1, 2, . . . , n 

or all the noise was concentrated on one eigenvalue, chosen randomly: 

fko = fko + \/n(rand (0, 1) - 0.5) ||/ - /||/|rand (0, 1) - 0.5| , 
fk = fk for k^ko. 

For the supposed noise level, $\delta = \|\tilde f - f\|/d$ was taken, where the values of $d$ were
1, 3, 5, 10, 20, 50, 100. With the Lavrentiev method 6000 different variants of
the problem $Au = f$ were solved; the parameter $r$ was chosen by rule P. The
ratios

$$V = \frac{\|u_{r(\delta)} - u_*\|}{\inf_{r \ge 0}\big\{\|(I - Ag_r(A))(u_0 - u_*)\|^2 + (\gamma r \max(\|\tilde f - f\|, \delta))^2\big\}^{1/2}}$$

were computed, characterizing the coefficients of quasioptimality (compare
with formula (5)). The ratio $V$ also shows how well rule P works in comparison
with the MD rule, which uses the actual noise level. Namely, we also solved
the same problems by the MD rule using the actual noise level, and then the
ratio $V$ was almost 1 (the average of $V$ was 0.95, the maximum of $V$ was 1.24).

The results of the numerical experiments are given in Table 1. The results show
that in the case of an exactly estimated noise level ($d = \|\tilde f - f\|/\delta = 1$) rule
P works nearly as well as the MD rule (the averages of the ratio $V$ were 0.991 and
0.95 respectively). As expected, in the case of an underestimated noise level
($d > 1$) the error of the approximate solution is larger than for the MD rule
which uses the exact noise level. But the ratio of the errors of these approximate
solutions is relatively small in comparison with the error made in estimating
the noise level. For example, if the supposed noise level was $d = 100$ times smaller
than the real value, the error of the approximate solution for rule P was only 7%
larger than for the MD rule which uses the exact noise level (the corresponding
averages of $V$ were 1.022 and 0.95), and for 96% of the problems the ratio $V$ for
rule P was smaller than 1.5. The numerical experiments also showed that if the
actual noise level is larger than the noise level used in the MD rule, then this
rule performs poorly. For example, if the actual noise level was 3 times larger than
the noise level used in the MD rule, the ratio $V$ was in most cases larger than
10.



Table 1. The Lavrentiev method. Rule P, $s = 0.75$, $b_1 = b_2 = 1.5\,C_M$

d          Number of   Average V   Maximal V   % problems for   % problems for
           problems                            which V < 1.5    which V < 3
1          851         0.991       3.95        97.70            99.19
3          901         0.982       4.74        97.44            99.36
5          826         0.984       7.13        96.51            98.88
10         873         0.956       5.99        97.89            99.60
20         861         0.953       6.83        97.32            99.46
50         874         0.974       11.30       97.10            99.08
100        813         1.022       18.50       96.03            98.58
All cases  6000        0.980       18.50       97.15            99.17
Summary. In this article we consider goal-oriented a posteriori error estimation for
the symmetric interior penalty discontinuous Galerkin finite element discretization of 
the compressible Navier-Stokes equations. Numerical experiments demonstrating the 
accuracy of the error estimation and the performance of the adaptive mesh refinement 
strategy will be presented. 



1 Introduction 

In recent years there has been tremendous interest in the design of discon- 
tinuous Galerkin finite element methods (DGFEM) for the discretization of 
compressible fluid flow problems; see, for example, [2, 3, 6, 7] and the refer- 
ences cited therein. The key advantages of these schemes are that DGFEM pro- 
vide robust and high-order accurate approximations, particularly in transport- 
dominated regimes, and that they are considerably flexible in the choice of 
mesh design. Indeed, DGFEM can easily handle non-matching grids and non- 
uniform, even anisotropic, polynomial approximation degrees. 

In this paper we introduce the symmetric version of the interior penalty 
DGFEM for the numerical approximation of the compressible Navier-Stokes 
equations. We then consider the a posteriori error analysis and adaptive mesh 
design for the underlying discretization method. In particular, here we focus 
on so-called ‘goal-oriented’ a posteriori error estimation which bounds the er- 
ror measured in terms of certain target functionals of real or physical interest. 
Typical examples that we shall consider here include the drag and lift coeffi- 
cients of a body immersed in a viscous fluid. By employing a duality argument 
we derive a weighted (Type I) a posteriori error bound which reflects the error 
creation and error propagation mechanisms inherent in viscous compressible 
fluid flows. On the basis of this a posteriori estimate, we design and implement 
the corresponding adaptive algorithm to ensure both the reliable and efficient 
control of the error in the prescribed target functional. The superiority of the
proposed approach over standard mesh refinement algorithms which employ
empirical error indicators will be demonstrated. This paper represents a
continuation of our previous work presented in the articles [10, 11].



2 The compressible Navier-Stokes equations 

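Writing $\rho$, $\mathbf{v} = (v_1, v_2)^T$, $p$, $E$ and $T$ to denote the density, velocity vector,
pressure, specific total energy and temperature, respectively, the equations are
given by

$$\nabla \cdot \big(\mathcal{F}^c(\mathbf{u}) - \mathcal{F}^v(\mathbf{u}, \nabla\mathbf{u})\big) \equiv \sum_{i=1}^{2} \frac{\partial}{\partial x_i}\mathbf{f}_i^c(\mathbf{u}) - \sum_{i=1}^{2} \frac{\partial}{\partial x_i}\mathbf{f}_i^v(\mathbf{u}, \nabla\mathbf{u}) = 0 \quad \text{in } \Omega, \tag{1}$$

where $\Omega$ is an open bounded domain in $\mathbb{R}^2$. Here, the vector of conservative
variables $\mathbf{u}$, the convective fluxes $\mathbf{f}_i^c$, $i = 1,2$, and the viscous fluxes $\mathbf{f}_i^v$,
$i = 1,2$, are defined by $\mathbf{u} = [\rho, \rho v_1, \rho v_2, \rho E]^T$,
$\mathbf{f}_i^c(\mathbf{u}) = [\rho v_i, \rho v_1 v_i + \delta_{1i}p, \rho v_2 v_i + \delta_{2i}p, \rho H v_i]^T$ and
$\mathbf{f}_i^v(\mathbf{u}, \nabla\mathbf{u}) = [0, \tau_{1i}, \tau_{2i}, \tau_{i1}v_1 + \tau_{i2}v_2 + \mathcal{K}T_{x_i}]^T$, respectively. Here, $\mathcal{K}$
is the thermal conductivity coefficient and $H$ is the total enthalpy defined
by $H = E + p/\rho$. The pressure is determined by the equation of state of an
ideal gas, i.e., $p = (\gamma - 1)\rho(E - \frac{1}{2}\mathbf{v}^2)$, where $\gamma = c_p/c_v$ is the ratio of specific
heat capacities; for dry air, $\gamma = 1.4$. For a Newtonian fluid, the viscous stress
tensor is given by $\tau = \mu\big(\nabla\mathbf{v} + (\nabla\mathbf{v})^T - \frac{2}{3}(\nabla \cdot \mathbf{v})I\big)$, where $\mu$ is the dynamic
viscosity coefficient; the temperature $T$ is given by $\mathcal{K}T = \frac{\mu\gamma}{Pr}(E - \frac{1}{2}\mathbf{v}^2)$, where
$Pr = 0.72$ is the Prandtl number.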
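
The relations between the conservative variables, the pressure and the convective fluxes can be checked numerically. The following Python sketch is only an illustration under the definitions above; the function names and the sample state are assumptions, and the viscous fluxes are not included.

```python
GAMMA = 1.4  # ratio of specific heat capacities for dry air


def pressure(u):
    """p = (gamma - 1) * rho * (E - v^2 / 2) from u = [rho, rho v1, rho v2, rho E]."""
    rho, m1, m2, rho_e = u
    kinetic = 0.5 * (m1 * m1 + m2 * m2) / rho  # rho * v^2 / 2
    return (GAMMA - 1.0) * (rho_e - kinetic)


def convective_flux(u, i):
    """Convective flux f_i^c(u) for direction i in {1, 2}."""
    rho, m1, m2, rho_e = u
    p = pressure(u)
    v_i = (m1, m2)[i - 1] / rho
    # rho * H * v_i = (rho E + p) * v_i, since H = E + p / rho.
    flux = [rho * v_i, m1 * v_i, m2 * v_i, (rho_e + p) * v_i]
    flux[i] += p  # Kronecker-delta term: pressure acts on momentum component i
    return flux


if __name__ == "__main__":
    state = [1.0, 2.0, 0.0, 4.5]  # hypothetical state: rho = 1, v = (2, 0), rho E = 4.5
    print(pressure(state), convective_flux(state, 1))
```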

The non-dimensionalized form of the Navier-Stokes equations (1) is given
by

$$\nabla \cdot \mathcal{F}^c(\mathbf{u}) - \nabla \cdot \big(G_{1j}\,\partial\mathbf{u}/\partial x_j,\ G_{2j}\,\partial\mathbf{u}/\partial x_j\big) = 0 \quad \text{in } \Omega, \tag{2}$$

where repeated indices are summed through their range. Here, the matrices
$G_{ij} = \partial\mathbf{f}_i^v(\mathbf{u}, \nabla\mathbf{u})/\partial\mathbf{u}_{x_j}$, for $i,j = 1,2$, i.e., $\mathbf{f}_i^v(\mathbf{u}, \nabla\mathbf{u}) = G_{ij}\,\partial\mathbf{u}/\partial x_j$,
$i = 1,2$, and Re denotes the Reynolds number.

Given that $\Omega \subset \mathbb{R}^2$ is a bounded region, with boundary $\Gamma$, the system of
conservation laws (2) must be supplemented by appropriate boundary conditions.
For simplicity of presentation, we assume that $\Gamma$ may be decomposed
as follows: $\Gamma = \Gamma_D \cup \Gamma_{N,sup} \cup \Gamma_{N,sub} \cup \Gamma_W$, where $\Gamma_D$, $\Gamma_{N,sup}$, $\Gamma_{N,sub}$ and $\Gamma_W$ are
distinct subsets of $\Gamma$ representing the Dirichlet (inflow), Neumann (supersonic-
outflow), Neumann (subsonic-outflow) and solid wall boundaries, respectively.
Thereby, we may specify the following boundary conditions: $\mathbf{u} = \mathbf{g}_D$ on $\Gamma_D$,
$\mathcal{F}^v(\mathbf{u}, \nabla\mathbf{u}) \cdot \mathbf{n} = 0$ on $\Gamma_{N,sup} \cup \Gamma_{N,sub}$; note that on $\Gamma_{N,sub}$ an additional condition
which imposes a given pressure $p_{out}$ is enforced. For solid wall boundaries, we
consider the distinction between isothermal and adiabatic conditions. To this
end, decomposing $\Gamma_W = \Gamma_{W,iso} \cup \Gamma_{W,adia}$, we set $\mathbf{v} = \mathbf{0}$ on $\Gamma_W$, $T = T_{wall}$ on
$\Gamma_{W,iso}$, $\mathbf{n} \cdot \nabla T = 0$ on $\Gamma_{W,adia}$, where $T_{wall}$ is a given wall temperature; we refer
to [2, 3, 5, 6] and the references cited therein for further details concerning the
imposition of suitable boundary conditions.



3 Discontinuous Galerkin Discretization 

In this section we introduce the discontinuous Galerkin method with interior 
penalty for the discretization of the compressible Navier-Stokes equations (2). 

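We assume that $\Omega$ can be subdivided into shape-regular meshes $\mathcal{T}_h = \{\kappa\}$
consisting of quadrilateral elements $\kappa$. For each $\kappa \in \mathcal{T}_h$, we denote by $\mathbf{n}_\kappa$ the
unit outward normal vector to the boundary $\partial\kappa$ and by $h_\kappa$ the elemental
diameter. An interior edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior
of $\partial\kappa^+ \cap \partial\kappa^-$, where $\kappa^+$ and $\kappa^-$ are two adjacent elements of $\mathcal{T}_h$. Similarly,
a boundary edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior of $\partial\kappa \cap \Gamma$
which consists of entire edges of $\partial\kappa$. We denote by $\mathcal{S}_I$ the union of all interior
edges of $\mathcal{T}_h$, by $\mathcal{S}_\Gamma$ the union of all boundary edges, and set $\mathcal{S} = \mathcal{S}_I \cup \mathcal{S}_\Gamma$.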

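Next, we define average and jump operators. To this end, let $\kappa^+$ and $\kappa^-$
be two adjacent elements of $\mathcal{T}_h$ and $\mathbf{x}$ be an arbitrary point on the interior
edge $e = \partial\kappa^+ \cap \partial\kappa^- \subset \mathcal{S}_I$. Moreover, let $\mathbf{v}$ and $\tau$ be vector- and matrix-
valued functions, respectively, that are smooth inside each element $\kappa^\pm$. By
$(\mathbf{v}^\pm, \tau^\pm)$ we denote the traces of $(\mathbf{v}, \tau)$ on $e$ taken from within the interior of
$\kappa^\pm$, respectively. Then, we define the averages at $\mathbf{x} \in e$ by $\{\!\{\mathbf{v}\}\!\} = (\mathbf{v}^+ + \mathbf{v}^-)/2$
and $\{\!\{\tau\}\!\} = (\tau^+ + \tau^-)/2$. Similarly, the jumps at $\mathbf{x} \in e$ are given by
$[\![\mathbf{v}]\!] = \mathbf{v}^+ \otimes \mathbf{n}_{\kappa^+} + \mathbf{v}^- \otimes \mathbf{n}_{\kappa^-}$ and $[\![\tau]\!] = \tau^+ \cdot \mathbf{n}_{\kappa^+} + \tau^- \cdot \mathbf{n}_{\kappa^-}$. For matrices
$\sigma, \tau \in \mathbb{R}^{m \times n}$, $m, n \ge 1$, we use the standard notation $\sigma : \tau = \sum_{k=1}^{m}\sum_{l=1}^{n} \sigma_{kl}\tau_{kl}$;
additionally, for vectors $\mathbf{v} \in \mathbb{R}^m$, $\mathbf{w} \in \mathbb{R}^n$, the matrix $\mathbf{v} \otimes \mathbf{w} \in \mathbb{R}^{m \times n}$ is defined
by $(\mathbf{v} \otimes \mathbf{w})_{kl} = v_k w_l$.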
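
The average and jump operators act pointwise on the two traces at an interior edge, so they can be illustrated with plain lists. The sketch below is not taken from any DGFEM library; the function names are hypothetical and the normal $\mathbf{n}_{\kappa^-} = -\mathbf{n}_{\kappa^+}$ is assumed, as holds on an interior edge.

```python
def outer(v, w):
    """(v (x) w)_{kl} = v_k * w_l."""
    return [[vk * wl for wl in w] for vk in v]


def average(a_plus, a_minus):
    """{{a}} = (a+ + a-) / 2 for vector traces given as lists."""
    return [0.5 * (p + m) for p, m in zip(a_plus, a_minus)]


def jump_vector(v_plus, v_minus, n_plus):
    """[[v]] = v+ (x) n+ + v- (x) n-, with n- = -n+; returns a matrix."""
    n_minus = [-c for c in n_plus]
    a, b = outer(v_plus, n_plus), outer(v_minus, n_minus)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def jump_matrix(tau_plus, tau_minus, n_plus):
    """[[tau]] = tau+ . n+ + tau- . n- = (tau+ - tau-) . n+; returns a vector."""
    return [sum((tp - tm) * c for tp, tm, c in zip(rp, rm, n_plus))
            for rp, rm in zip(tau_plus, tau_minus)]


def ddot(sigma, tau):
    """sigma : tau = sum over k, l of sigma_kl * tau_kl."""
    return sum(s * t for rs, rt in zip(sigma, tau) for s, t in zip(rs, rt))
```

For a function that is continuous across the edge both traces coincide, so the jump vanishes while the average returns the common value.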

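Given a polynomial degree $p \ge 1$, we define the finite element space
$\mathbf{V}_h = \{\mathbf{v} \in [L_2(\Omega)]^4 : \mathbf{v}|_\kappa \in [Q_p(\kappa)]^4,\ \kappa \in \mathcal{T}_h\}$, where $Q_p(\kappa)$ denotes the space of
tensor product polynomials on $\kappa$ of degree $p$ in each coordinate direction. We
consider the following interior penalty discontinuous Galerkin discretization of
equations (2): find $\mathbf{u}_h \in \mathbf{V}_h$ such that

$$\begin{aligned}
\mathcal{N}(\mathbf{u}_h, \mathbf{v}_h) \equiv{} & -\int_\Omega \mathcal{F}^c(\mathbf{u}_h) : \nabla_h\mathbf{v}_h \, dx + \sum_{\kappa \in \mathcal{T}_h} \int_{\partial\kappa \setminus \Gamma} \mathcal{H}(\mathbf{u}_h^+, \mathbf{u}_h^-, \mathbf{n}_\kappa) \cdot \mathbf{v}_h^+ \, ds \\
& + \int_\Omega \mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h) : \nabla_h\mathbf{v}_h \, dx - \int_{\mathcal{S}_I} \{\!\{\mathcal{F}^v(\mathbf{u}_h, \nabla_h\mathbf{u}_h)\}\!\} : [\![\mathbf{v}_h]\!] \, ds \\
& - \int_{\mathcal{S}_I} \{\!\{(G_{i1}^T \partial_h\mathbf{v}_h/\partial x_i,\ G_{i2}^T \partial_h\mathbf{v}_h/\partial x_i)\}\!\} : [\![\mathbf{u}_h]\!] \, ds \\
& + \int_{\mathcal{S}_I} \delta\,[\![\mathbf{u}_h]\!] : [\![\mathbf{v}_h]\!] \, ds + \mathcal{N}_\Gamma(\mathbf{u}_h, \mathbf{v}_h) = 0
\end{aligned} \tag{3}$$

for all $\mathbf{v}_h$ in $\mathbf{V}_h$. Here, the subscript $h$ on the operators $\nabla_h$ and $\partial_h/\partial x_i$,
$i = 1,2$, is used to denote the discrete counterparts of $\nabla$ and $\partial/\partial x_i$, $i = 1,2$,
respectively, taken elementwise. Furthermore, $\mathcal{H}(\cdot, \cdot, \cdot)$ denotes a numerical
(convective) flux function, assumed to be Lipschitz continuous, consistent and
conservative. The function $\delta \in L^\infty(\mathcal{S})$ denotes the so-called discontinuity
penalization function; defining $\mathbf{h} \in L^\infty(\mathcal{S})$ by $\mathbf{h}(\mathbf{x}) = \min\{h_{\kappa^+}, h_{\kappa^-}\}$ when
$\mathbf{x} \in e = \partial\kappa^+ \cap \partial\kappa^- \subset \mathcal{S}_I$, and $\mathbf{h}(\mathbf{x}) = h_\kappa$ when $\mathbf{x} \in e = \partial\kappa \cap \Gamma \subset \mathcal{S}_\Gamma$, we set
$\delta = C_{IP}/(\mathbf{h}\,\mathrm{Re})$, with a parameter $C_{IP} > 0$ that is independent of $\mathbf{h}$. Finally,
we write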



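$$\begin{aligned}
\mathcal{N}_\Gamma(\mathbf{u}_h, \mathbf{v}_h) ={} & \int_\Gamma \mathcal{H}(\mathbf{u}_h^+, \mathbf{u}_\Gamma(\mathbf{u}_h^+), \mathbf{n}) \cdot \mathbf{v}_h^+ \, ds + \int_{\Gamma_D \cup \Gamma_W} \delta\,(\mathbf{u}_h^+ - \mathbf{u}_\Gamma(\mathbf{u}_h^+)) \cdot \mathbf{v}_h^+ \, ds \\
& - \int_{\Gamma_D \cup \Gamma_{W,iso}} \mathcal{F}^v(\mathbf{u}_h^+, \nabla_h\mathbf{u}_h^+) : \mathbf{v}_h^+ \otimes \mathbf{n} \, ds - \int_{\Gamma_{W,adia}} \mathcal{F}^{v,adia}(\mathbf{u}_h^+, \nabla_h\mathbf{u}_h^+) : \mathbf{v}_h^+ \otimes \mathbf{n} \, ds \\
& - \int_{\Gamma \setminus \Gamma_{N,sup}} \big(G_{i1}^T(\mathbf{u}_h^+)\,\partial_h\mathbf{v}_h^+/\partial x_i,\ G_{i2}^T(\mathbf{u}_h^+)\,\partial_h\mathbf{v}_h^+/\partial x_i\big) : (\mathbf{u}_h^+ - \mathbf{u}_\Gamma(\mathbf{u}_h^+)) \otimes \mathbf{n} \, ds,
\end{aligned}$$

where the boundary function $\mathbf{u}_\Gamma(\mathbf{u})$ is given according to the type of boundary
condition imposed. We set $\mathbf{u}_\Gamma(\mathbf{u}) = \mathbf{g}_D$ on $\Gamma_D$, $\mathbf{u}_\Gamma(\mathbf{u}) = \mathbf{u}$ on $\Gamma_{N,sup}$, and
$\mathbf{u}_\Gamma(\mathbf{u}) = (\rho, \rho v_1, \rho v_2, \frac{p_{out}}{\gamma - 1} + \frac{1}{2}\rho\mathbf{v}^2)^T$ on $\Gamma_{N,sub}$. Furthermore, we set
$\mathbf{u}_\Gamma(\mathbf{u}) = (\rho, 0, 0, \rho c_v T_{wall})^T$ on $\Gamma_{W,iso}$, and
$\mathbf{u}_\Gamma(\mathbf{u}) = (\rho, 0, 0, \rho c_v T)^T = (\rho, 0, 0, \rho E - \frac{1}{2}\rho\mathbf{v}^2)^T$
on $\Gamma_{W,adia}$. Finally, we note that on the adiabatic boundary $\Gamma_{W,adia}$, we define
$\mathcal{F}^{v,adia}(\mathbf{u}, \nabla\mathbf{u})$ such that

$$\mathcal{F}^{v,adia}(\mathbf{u}, \nabla\mathbf{u}) \cdot \mathbf{n} = (0,\ \tau_{11}n_1 + \tau_{12}n_2,\ \tau_{21}n_1 + \tau_{22}n_2,\ 0)^T.$$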



Remark 1. We remark that the discretization of the viscous terms has been 
done by employing the symmetric version of the interior penalty method, cf. 
[1], and the references cited therein. In particular, we note that this scheme 
is derived by first re-writing (2) as a system of first-order partial differen- 
tial equations through the introduction of appropriate auxiliary variables. By 
defining suitable numerical flux functions, these additional variables are
subsequently eliminated; see [1, 8] for details. We note that within this process




414 R. Hartmann, P. Houston 



the transposes of the matrices $G_{ij}$, $i,j = 1,2$, naturally arise in the definition
of the DGFEM. Moreover, this is necessary to ensure the adjoint consistency
of the resulting method which is essential for the approximation of functionals
of the solution, cf. [1, 8]. As a final remark, we note that the discontinuity
penalization parameter $\delta$ must be chosen sufficiently large in order to guarantee
the stability of the underlying method.



4 Goal-oriented a posteriori error estimation 

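In this section, we shall be concerned with controlling the error in the
numerical solution measured in terms of a given target functional $J(\cdot)$; for a
detailed discussion, we refer to the review articles [4, 13]. Assuming that $J(\cdot)$
is differentiable, we write

$$\bar J(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h) = J(\mathbf{u}) - J(\mathbf{u}_h) = \int_0^1 J'[\theta\mathbf{u} + (1 - \theta)\mathbf{u}_h](\mathbf{u} - \mathbf{u}_h)\, d\theta, \tag{4}$$

where $J'[\mathbf{w}](\cdot)$ denotes the Fréchet derivative of $J(\cdot)$ evaluated at some $\mathbf{w}$
in $\mathbf{V}$. Here, $\mathbf{V}$ is some suitably chosen function space such that $\mathbf{V}_h \subset \mathbf{V}$.
Analogously, we write

$$\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \mathbf{u} - \mathbf{u}_h, \mathbf{v}) = \mathcal{N}(\mathbf{u}, \mathbf{v}) - \mathcal{N}(\mathbf{u}_h, \mathbf{v}) = \int_0^1 \mathcal{N}'_{\mathbf{u}}[\theta\mathbf{u} + (1 - \theta)\mathbf{u}_h](\mathbf{u} - \mathbf{u}_h, \mathbf{v})\, d\theta \tag{5}$$

for all $\mathbf{v}$ in $\mathbf{V}$. Here, $\mathcal{N}'_{\mathbf{u}}[\mathbf{w}](\cdot, \mathbf{v})$ denotes the Fréchet derivative of
$\mathbf{u} \mapsto \mathcal{N}(\mathbf{u}, \mathbf{v})$, for $\mathbf{v} \in \mathbf{V}$ fixed, at some $\mathbf{w}$ in $\mathbf{V}$. We remark that the
linearization defined in (5) is only a formal calculation, in the sense that
$\mathcal{N}'_{\mathbf{u}}[\mathbf{w}](\cdot, \cdot)$ may not in general exist. Instead, a suitable approximation to
$\mathcal{N}'_{\mathbf{u}}[\mathbf{w}](\cdot, \cdot)$ must be determined, for example, by computing appropriate finite
difference quotients of $\mathcal{N}(\cdot, \cdot)$, cf. [9, 10]. Given a suitable linearization, we
introduce the following dual problem: find $\mathbf{z} \in \mathbf{V}$ such that

$$\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \mathbf{w}, \mathbf{z}) = \bar J(\mathbf{u}, \mathbf{u}_h; \mathbf{w}) \quad \forall \mathbf{w} \in \mathbf{V}. \tag{6}$$

We assume that (6) possesses a unique solution. Clearly, the validity of this
assumption depends on both the definition of $\mathcal{M}(\mathbf{u}, \mathbf{u}_h; \cdot, \cdot)$ and the choice of
the target functional under consideration, cf. [10]. For the ensuing error
analysis, we must therefore assume that the dual problem (6) is well-posed.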

Proposition 1. Let $u$ and $u_h$ denote the solutions of (2) and (3), respectively,
and suppose that the dual problem (6) is well-posed. Then,

$$|J(u) - J(u_h)| \le \sum_{\kappa \in \mathcal{T}_h} |\eta_\kappa|, \qquad (7)$$

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where 



Jk JdK\r 

L 
L 



JdK\r 

- W(u+, Ur(u;[), n^)) • w+ ds 



0 (fe 



+ 



X [ {{Gj^dhOJh/dxi, Gj^dhOJh/dxi) : |u/i] - V^u;.)] • cfe 

^ JdK\r 

/ -ur(u+)) -W+& 

J d K,n{rDU Fw) 

[ {r{u+,Whut)-n,)-ujtds 

^ dK,n{rN,sub^rN,sup) 

[ (^(u+ V,u+)-^(u+,V/.u+)) :u;+®n^/5 

J d tzG\r \^^adia 



+ [ {Gj^{'>^^)dhM-l;/dxi,Gf2{u-l)dhU)l/dxi) : 

J dK,c\{r\rM,sup) 



: (u+ -ur(u;J;)) 0n&, 



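and $\omega_h = z - z_h$ for all $z_h$ in $V_h$. Here,
$R(u_h)|_\kappa = -\nabla_h \cdot \mathcal{F}^c(u_h) + \nabla_h \cdot \mathcal{F}^v(u_h, \nabla_h u_h)$,
$\kappa \in \mathcal{T}_h$, denotes the elementwise residual.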

Proof. Choosing $w = u - u_h$ in (6), recalling the linearization performed in
(4), and exploiting the Galerkin orthogonality property of the DGFEM, we
get

$$J(u) - J(u_h) = J(u, u_h; u - u_h) = \mathcal{M}(u, u_h; u - u_h, z)
  = \mathcal{M}(u, u_h; u - u_h, z - z_h) = -\mathcal{N}(u_h, z - z_h) \quad \forall z_h \in V_h.$$

Equation (7) now follows by the divergence theorem together with the triangle
inequality.
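
The mechanism behind this proof is easiest to see in a linear, finite-dimensional analogue. The following sketch is not the DGFEM of this paper: it assumes a 2x2 linear system A u = b, a linear functional J(u) = g . u and a deliberately perturbed approximation u_h (all names and numbers are illustrative). Solving the adjoint system A^T z = g then reproduces the error J(u) - J(u_h) as the dual-weighted residual z . (b - A u_h), which mirrors the chain of identities used above.

```python
def solve2(M, rhs):
    """Solve a 2x2 linear system M x = rhs by Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical primal problem A u = b and target functional J(u) = g . u.
A = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
g = [1.0, -1.0]

u = solve2(A, b)                      # "exact" primal solution
u_h = [u[0] + 0.05, u[1] - 0.02]      # perturbed stand-in for a discrete solution
z = solve2(transpose(A), g)           # dual (adjoint) solution: A^T z = g

residual = [bi - ri for bi, ri in zip(b, matvec(A, u_h))]
true_error = dot(g, u) - dot(g, u_h)  # J(u) - J(u_h)
estimated_error = dot(z, residual)    # dual-weighted residual
```

In this linear setting the two quantities agree up to rounding; in the nonlinear setting of the paper the identity holds only for the formal linearization (5).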

We end this section by noting that the Type I a posteriori error bound (7)
depends on the unknown analytical solutions to the primal and dual problems.
Thus, in order to render these quantities computable, both $u$ and $z$ must
be replaced by suitable approximations. Here, the linearizations leading to
$\mathcal{M}(u, u_h; \cdot, \cdot)$ and $J(u, u_h; \cdot)$ are performed about $u_h$ and the
dual solution $z$ is replaced by a DGFEM approximation computed on the same mesh
$\mathcal{T}_h$ used for $u_h$, but with a higher degree polynomial.



5 Numerical example 

Fig. 1. (a) Mach isolines, Ma = i/10, i = 1, ..., 10, of the flow around the
NACA0012 airfoil; (b) Convergence of the error $|J(u) - J(u_h)|$ using each mesh
refinement strategy

Table 1. Adaptive algorithm based on the weighted error indicator $|\eta_\kappa|$.

  Elements   DOF     J(u) - J(u_h)   sum eta_k     theta_1   sum |eta_k|   theta_2
  63         1008    -1.534e-01      -1.299e-01    0.85      9.2e-01       6.02
  105        1680    -4.838e-02      -1.114e-02    0.23      7.4e-01       15.35
  189        3024     7.814e-03       5.574e-03    0.71      7.9e-01       100.60
  306        4896    -3.819e-02      -1.296e-02    0.34      8.1e-01       21.11
  522        8352    -2.107e-02      -9.249e-03    0.44      8.5e-01       40.25
  909        14544   -7.686e-03      -3.746e-03    0.49      7.9e-01       102.9
  1512       24192   -1.049e-03      -9.130e-04    0.87      1.0e+00       962
  2559       40944   -2.628e-04      -2.395e-04    0.91      6.2e-01       2374

In this section we present a numerical example to highlight the advantages
of designing an adaptive finite element algorithm based on the weighted error
indicator $|\eta_\kappa|$ in comparison with both uniform mesh refinement and
an adaptive algorithm based on an empirical refinement indicator which
does not require the solution of an auxiliary (dual) problem; for simplicity, we
employ a Type II residual indicator of the form derived in [10]. Throughout
this section, we employ the Vijayasundaram flux for the discretization of the
convective terms and set $p = 1$ (bilinear elements). Finally, for both adaptive
refinement strategies, we use the fixed fraction refinement algorithm with
refinement and derefinement fractions set to 20% and 10%, respectively; we also
note that for the computation of $|\eta_\kappa|$, the dual solution is approximated
using piecewise biquadratic polynomials.
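
The fixed fraction strategy mentioned above can be sketched in a few lines. The text only states the two fractions; the sketch assumes the usual reading, namely that the 20% of elements with the largest indicators are marked for refinement and the 10% with the smallest indicators for derefinement. The function name and the sample indicator values are hypothetical.

```python
def fixed_fraction_mark(indicators, refine_fraction=0.20, coarsen_fraction=0.10):
    """Return (refine, coarsen) sets of element indices.

    Elements are ranked by |indicator|; the top fraction is marked for
    refinement and the bottom fraction for derefinement (assumed reading).
    """
    n = len(indicators)
    order = sorted(range(n), key=lambda i: abs(indicators[i]), reverse=True)
    n_refine = int(round(refine_fraction * n))
    n_coarsen = int(round(coarsen_fraction * n))
    refine = set(order[:n_refine])
    # order[n:] is empty, so n_coarsen == 0 yields an empty set.
    coarsen = set(order[n - n_coarsen:]) - refine
    return refine, coarsen

# Illustrative indicator values for a mesh with ten elements.
eta = [0.9, -0.05, 0.3, 0.001, -0.7, 0.02, 0.15, 0.004, 0.08, 0.5]
refine, coarsen = fixed_fraction_mark(eta)
```

With ten elements this marks two elements for refinement and one for derefinement; the absolute value is used because the weighted indicators may carry a sign.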

We consider a Mach 0.8 flow at an angle of attack $\alpha = 10^\circ$ with Reynolds
number Re = 73 and constant temperature on the profile, cf. [3, 12]. The
solution to this problem consists of a flow that is mainly subsonic with a small
supersonic region above the airfoil, see Fig. 1(a). Here, we consider the
evaluation of the inviscid drag coefficient ($c_{dp}$) on the surface of the airfoil.
On the basis of a fine grid computation, the reference value of the functional is
given by $J(u) \approx 0.224$.

Table 2. Nonlinear Newton residuals (Res.) and convergence rates on a sequence of
uniformly refined meshes. *: On the coarsest mesh only the last 4 out of 6 iteration
steps are displayed.

  Mesh 1           Mesh 2           Mesh 3           Mesh 4           Mesh 5
  Res.     Rate    Res.     Rate    Res.     Rate    Res.     Rate    Res.     Rate
  6.5-02*  -       8.1-01   -       4.4-01   -       2.4-01   -       1.1-01   -
  2.5-02   3       4.2-02   19      3.3-02   13      1.6-02   16      5.9-03   19
  6.7-04   37      1.2-03   36      1.6-03   20      2.6-04   62      1.4-04   43
  6.6-07   1021    1.4-06   808     1.8-05   92      2.6-07   976     1.7-07   804
  3.7-10   1755    3.1-10   4538    2.9-09   6199    5.5-11   4750    1.7-10   1021

Fig. 2. Convergence of the nonlinear residual with the number of Newton steps
employed: (a) Uniform mesh refinement; (b) Adaptive mesh refinement using the
empirical error indicator

In Fig. 1(b) we compare the true error in the computed target functional
$J(\cdot)$ using all three mesh refinement strategies. Here, we clearly observe the
superiority of the weighted a posteriori error indicator; at all refinement steps
the error in the computed functional is less than the corresponding quantity
when either uniform refinement or adaptive refinement based on an empirical
residual indicator is employed. Indeed, on the final mesh the true error in
$J(\cdot)$ is over an order of magnitude smaller than $|J(u) - J(u_h)|$ computed on
the sequence of meshes generated by the empirical indicator.

In Table 1 we collect the data of the adaptive algorithm when employing
the weighted indicators. Here, we show the number of elements and degrees
of freedom (DOF) in $V_h$, the true error in the functional $J(u) - J(u_h)$,
the computed error representation formula $\sum_\kappa \eta_\kappa$, the approximate
a posteriori error bound $\sum_\kappa |\eta_\kappa|$ and their respective effectivity
indices $\theta_1 = \sum_\kappa \eta_\kappa / (J(u) - J(u_h))$ and
$\theta_2 = \sum_\kappa |\eta_\kappa| / |J(u) - J(u_h)|$. First we note that on all
refinement steps the correct sign of the error is predicted by the computed error
representation formula. Furthermore, whereas on very coarse meshes the quality of
$\sum_\kappa \eta_\kappa$ is rather poor, in the sense that $\theta_1$ is noticeably smaller
than one, we see that the effectivity indices $\theta_1$ slowly tend towards unity as
the mesh is refined. On the other hand, even just the application of the triangle
inequality leads to significant over-estimation of the true error in the computed
functional; indeed, here we see that $\theta_2$ increases.
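
As a check on the definitions of the effectivity indices, the following sketch recomputes them from the values printed in Table 1. The function name is illustrative. Because the bound column is tabulated with only two significant digits, the recomputed second index only approximates the printed one, whereas the first index reproduces the printed values after rounding.

```python
# (elements, J(u) - J(u_h), sum of eta_k, sum of |eta_k|) as printed in Table 1.
rows = [
    (63,   -1.534e-01, -1.299e-01, 9.2e-01),
    (105,  -4.838e-02, -1.114e-02, 7.4e-01),
    (189,   7.814e-03,  5.574e-03, 7.9e-01),
    (306,  -3.819e-02, -1.296e-02, 8.1e-01),
    (522,  -2.107e-02, -9.249e-03, 8.5e-01),
    (909,  -7.686e-03, -3.746e-03, 7.9e-01),
    (1512, -1.049e-03, -9.130e-04, 1.0e+00),
    (2559, -2.628e-04, -2.395e-04, 6.2e-01),
]

def effectivity(true_error, representation, bound):
    """Effectivity indices of the error representation and of the error bound."""
    theta1 = representation / true_error
    theta2 = bound / abs(true_error)
    return theta1, theta2

thetas = [effectivity(err, rep, bnd) for _, err, rep, bnd in rows]
theta1_rounded = [round(t1, 2) for t1, _ in thetas]
```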

As a final remark, we note that in all of our computations, we employed
a (damped) Newton iteration to solve the system of nonlinear equations arising
from the discontinuous Galerkin discretization of the compressible
Navier-Stokes equations. Within each Newton step, GMRES with Block-Gauss-Seidel
preconditioning is exploited to solve the resulting linearized problem involving
the Jacobian matrix; cf. [9] for further details. For each of the three mesh
refinement algorithms employed above, the nonlinear solver proceeds as follows:
starting on the coarsest mesh with free-flow conditions, the nonlinear problem
is solved by the Newton iteration described above. When the nonlinear
residual converges below $10^{-8}$, the mesh is refined once; then the discrete
solution is interpolated onto the new mesh and is thereby taken as the starting
solution for the Newton iteration on the newly refined mesh. In Table 2 we
present the history of this solution process on a sequence of uniformly refined
computational meshes; these results are also summarized in Fig. 2(a). On the
coarsest mesh, with free-flow values, the Newton iteration requires a few steps
until the iterative solution reaches the range of quadratic convergence. Indeed,
after only 6 Newton steps the nonlinear residual is below the given tolerance
of $10^{-8}$ and the mesh is refined; on subsequent meshes, the Newton iteration
requires only 4 steps to reduce the residual below the given tolerance.
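
The behaviour described here, a few damped steps followed by rapid convergence, can be reproduced on a scalar model problem. The sketch below is only an illustration: it assumes a simple step-halving rule as the damping strategy (the paper does not specify its damping), uses f(x) = arctan(x) as a stand-in for the nonlinear system, and forms the rate as the ratio of successive residuals, which appears to be how the Rate column of Table 2 is built.

```python
import math

def damped_newton(f, df, x0, tol=1e-8, max_steps=50):
    """Newton iteration with step halving; returns (x, residual history)."""
    x = x0
    history = [abs(f(x))]
    for _ in range(max_steps):
        if history[-1] < tol:
            break
        step = -f(x) / df(x)
        damping = 1.0
        # Halve the step until the residual decreases (assumed damping rule).
        while abs(f(x + damping * step)) >= history[-1] and damping > 1e-10:
            damping *= 0.5
        x += damping * step
        history.append(abs(f(x)))
    return x, history

# Undamped Newton diverges for arctan from x0 = 3; the damped variant converges.
root, residuals = damped_newton(math.atan, lambda x: 1.0 / (1.0 + x * x), x0=3.0)
rates = [a / b for a, b in zip(residuals, residuals[1:]) if b > 0]
```

In a mesh hierarchy, the converged solution on one mesh would be interpolated to the next mesh and passed in as `x0`, which is why fewer steps are needed there.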

The convergence behaviour of the Newton iteration is analogous even when
locally refined meshes are employed. Indeed, in Fig. 2(b) we plot the nonlinear
residuals against the number of Newton steps employed for each of the meshes
generated using the empirical error indicator. Here, we again see that on the
first mesh 6 Newton steps are required to satisfy the convergence criterion,
while only 4 are necessary on subsequent meshes. Analogous behaviour is also
observed on the adaptive meshes generated using the weighted error indicator
$|\eta_\kappa|$; for brevity, these results have been omitted.
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Summary. We present a novel approach to unstructured tetrahedral mesh genera- 
tion in 3D with minimum geometrical input. The boundary of the domain is identified 
adaptively with adjustable precision. The grid points of the mesh are distributed by 
means of an algorithm based on the analogy with a system of electrically charged 
particles. A-priori local refinements of the mesh are possible. Surface and volume 
meshes are built using advancing-front-type algorithms. All steps of the algorithm
are suitable for efficient parallelisation. An application of the presented approach is 
shown. 



1 Motivation 

Nowadays, computational mathematics is becoming part of new, challenging
interdisciplinary computer-assisted technologies in medicine, natural sciences,
industry and elsewhere^1. Specifically for mesh generation this means that
simplification and automation of geometrical inputs become a crucial issue.
Traditional algorithms, among which the most popular ones are based on Solid
Modeling Techniques (SMT) and CAD modellers, have not been designed to
process machine-generated information such as outputs of MRI scans, image
recognition devices, physical measurements, geographic databases etc. It is our
aim to design a fully automatic parallel generator of unstructured tetrahedral
meshes with this capability.

The example geometry in Fig. 1 is given by the formula 

0= i^{x,y,z);{x - R{(j))cos{(t))f + {y - R{(j))sin{4>)f +{z - <r‘^{4>), 

(j> e (o,47t)| 



with 

^1 The authors acknowledge the financial support of the Grant Agency of the Czech
Republic under Grant No. GP102/01/D114.
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Fig. 1. Example of a 3D geometry whose optimal computer representation is a non- 
trivial issue 



W) = I-A. 

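The minimum information necessary for the definition of the geometry of a 3D
domain $\Omega$ is its characteristic function $\chi_\Omega$,

$$\chi_\Omega(x) = \begin{cases} 1, & x \in \Omega, \\ 0, & x \notin \Omega. \end{cases} \qquad (1)$$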



In this paper we introduce a first working version of a C++ mesh generator
XGEN3D that has been designed to process this form of geometrical input.
The algorithm consists of four steps,

step 1: adaptive identification of the boundary $\partial\Omega$,
step 2: iterative distribution of grid points,
step 3: generation of surface mesh,
step 4: generation of volume mesh,

that will be discussed in the following sections.

Mesh generation is a scientific discipline with a very long tradition, and due 
to its variety and diversity it is difficult to select a few references on which our 
work is exactly based. One of the best places where mesh-generation-related 
information of virtually any kind can be found, that we used intensively, is 
Robert Schneiders’ web page [3]. 
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2 Adaptive identification of the boundary df2 

In order to keep the computer implementation at a reasonable level of
complexity, the user is requested to provide the following parameters in
addition to the characteristic function $\chi_\Omega$: co-ordinates of a (reasonably
small) cube $C \subset \mathbb{R}^3$ (initial cube) such that $\Omega \subset C$, and
tolerances $TOL_\Omega$, $TOL_{\partial\Omega}$, $TOL_{\partial\Omega} \le TOL_\Omega$, with which
the geometry of $\Omega$ will be approximated. The meaning of these parameters will
be explained later. Moreover the user defines a mesh density function
$h(x) : C \to \mathbb{R}$ that at every point $x \in \Omega$ indicates the desired mean
edge length.

The algorithm produces a continuous, piecewise triangular approximation
of the boundary $\partial\Omega$. It proceeds recursively, starting by dividing the
initial cube into eight smaller sub-cubes (hence oct-tree), and using the function
$\chi_\Omega$ to determine whether a sub-cube intersects with the boundary
$\partial\Omega$ or not. The adaptive splitting process is continued until the size of
the sub-cubes reaches the parameter $TOL_\Omega$. The resulting sub-cubes are called
basic cubes. Splitting of the basic cubes further continues until
$TOL_{\partial\Omega}$ is reached. The resulting sub-cubes are called boundary cubes.

At this point the complete (discrete) information about the boundary
$\partial\Omega$ is contained in the boundary cubes. A simple algorithm allows for
accurate location of points (boundary points) where the edges of the boundary
cubes intersect with the boundary $\partial\Omega$. For $TOL_{\partial\Omega}$ sufficiently
small, the boundary $\partial\Omega$ can be locally approximated by a plane. In other
words, for each boundary cube the corresponding boundary points form a planar
polygon with 3 to 7 vertices $P_1, P_2, \ldots, P_k$.

If $k = 3$, we already have a portion of the desired triangular approximation
of $\partial\Omega$. If $k \ge 4$, we need to construct a triangulation of the polygon.
This is most easily achieved by calculating its center of gravity,

$$P_c = \frac{1}{k}(P_1 + P_2 + \ldots + P_k),$$

and defining $k$ triangles $(P_1, P_2, P_c)$, $(P_2, P_3, P_c)$, ...,
$(P_{k-1}, P_k, P_c)$, $(P_k, P_1, P_c)$.
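
A minimal sketch of this centroid-based splitting is given below. It assumes that the polygon vertices are supplied in order around the polygon and includes the closing triangle through the last and the first vertex, which is needed for the k triangles to cover the polygon. The function name is illustrative and not taken from XGEN3D.

```python
def fan_triangulate(polygon):
    """Split a planar polygon (ordered list of points) into triangles.

    A triangle is returned unchanged; otherwise every polygon edge is
    joined with the center of gravity of the vertices.
    """
    k = len(polygon)
    if k == 3:
        return [tuple(polygon)]
    dim = len(polygon[0])
    center = tuple(sum(p[d] for p in polygon) / k for d in range(dim))
    return [(polygon[i], polygon[(i + 1) % k], center) for i in range(k)]

# Example: a unit square lying in the plane z = 0.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
triangles = fan_triangulate(square)
```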

In summary, the recursive algorithm for approximating the boundary $\partial\Omega$
can be written as follows:

Boundary identification algorithm:

1. Begin with the initial cube C.

2. Split the current cube into 8 sub-cubes.

3. With each sub-cube:

   3.1. Does it intersect with the boundary?

        NO: Is the size of its edge smaller than $TOL_\Omega$?
            YES: STOP.
            NO: Continue with step 2.

        YES: Is the size of its edge smaller than $TOL_{\partial\Omega}$?
            YES: Build an approximation to $\partial\Omega$ inside of the cube and STOP.
            NO: Continue with step 2.

4. STOP.
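
The recursion above can be sketched as follows. This is not the XGEN3D implementation: the intersection test is an assumption (a cube is taken to intersect the boundary when the characteristic function disagrees on its eight corners and its centre), the boundary approximation inside a cube is replaced by simply collecting the boundary cubes, and all names are hypothetical.

```python
def cuts_boundary(chi, origin, size):
    """Assumed test: corner and centre samples of chi are not all equal."""
    x0, y0, z0 = origin
    samples = [chi(x0 + i * size, y0 + j * size, z0 + k * size)
               for i in (0, 1) for j in (0, 1) for k in (0, 1)]
    samples.append(chi(x0 + size / 2, y0 + size / 2, z0 + size / 2))
    return any(samples) and not all(samples)

def identify_boundary(chi, origin, size, tol_omega, tol_boundary, out):
    """Recursive oct-tree splitting; boundary cubes are appended to out."""
    half = size / 2
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                sub = (origin[0] + i * half, origin[1] + j * half,
                       origin[2] + k * half)
                cut = cuts_boundary(chi, sub, half)
                if cut and half < tol_boundary:
                    out.append((sub, half))   # boundary cube: stop splitting
                elif not cut and half < tol_omega:
                    continue                  # no boundary detected: stop
                else:
                    identify_boundary(chi, sub, half, tol_omega, tol_boundary, out)
    return out

def ball(x, y, z):
    """Characteristic function of the open unit ball (example geometry)."""
    return x * x + y * y + z * z < 1.0

cubes = identify_boundary(ball, (-1.5, -1.5, -1.5), 3.0, 0.8, 0.2, [])
```

Sampling can miss features thinner than a cube, which is the reason the listing keeps splitting cubes without a detected intersection until the coarser tolerance is reached.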

The output of this algorithm is a piecewise triangular approximation to the
boundary $\partial\Omega$. The algorithm can analyse a geometry of arbitrary complexity
and approximate it as accurately as desired. The only limiting factor is the
amount of memory and CPU time consumed by the software. Let us point
out that the resulting set of triangles cannot be used as a surface mesh yet:
the size and shape of the triangles vary irregularly and their diameter has
nothing in common with the user-specified mesh density function $h(x)$.

Fig. 2 shows the approximation of a part of the domain $\Omega$ from
Fig. 1 with three different values of $TOL_{\partial\Omega}$.

Fig. 2. Approximation of $\partial\Omega$ with various levels of accuracy (low, optimal
and high accuracy): $TOL_{\partial\Omega}$ = 0.1, 0.03 and 0.015, respectively



3 Distribution of grid points 

First we need to estimate the total number N of grid points. With the shape
of $\Omega$ known from the previous step, we numerically integrate the mesh density
function $h(x)$ over the domain. An average edge length is defined as

$$h_a = \frac{1}{|\Omega|} \int_\Omega h(x)\, dx.$$

Finally we use the information about the volume of an equilateral tetrahedron
with the edge length $h_a$ to compute the approximate number of tetrahedra in
the mesh and the optimal number of grid points N.
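
The first part of this estimate can be sketched as below. The sketch assumes a midpoint rule on a uniform grid of sample cells as the numerical integration (the paper does not state which quadrature is used) and uses the volume a^3 / (6 sqrt(2)) of a regular tetrahedron with edge a. The conversion from the number of tetrahedra to the number of grid points N is not specified in the text and is therefore left out. All names are hypothetical.

```python
import math

def mean_edge_length(chi, h, lo, hi, n=40):
    """Midpoint-rule estimates of |Omega| and of h_a = (integral of h) / |Omega|.

    chi is the characteristic function, h the mesh density function and
    [lo, hi]^3 a cube containing the domain.
    """
    step = (hi - lo) / n
    cell = step ** 3
    volume = 0.0
    integral = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                p = (lo + (i + 0.5) * step,
                     lo + (j + 0.5) * step,
                     lo + (k + 0.5) * step)
                if chi(*p):
                    volume += cell
                    integral += h(*p) * cell
    return volume, integral / volume

def tetrahedra_estimate(volume, h_a):
    """Number of regular tetrahedra with edge h_a that fill the given volume."""
    return volume / (h_a ** 3 / (6.0 * math.sqrt(2.0)))

def unit_cube(x, y, z):
    return 0.0 < x < 1.0 and 0.0 < y < 1.0 and 0.0 < z < 1.0

volume, h_a = mean_edge_length(unit_cube, lambda x, y, z: 0.1, 0.0, 1.0, n=10)
n_tets = tetrahedra_estimate(volume, h_a)
```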

Next we place N geometrical points with pseudo-random positions into
the domain $\Omega$. By pseudo-random we mean that the positions are generated
randomly, but we do not allow any two points to lie too close to each other.
When all of the N points are generated, we start optimizing their positions.
For this we adopt the approach [4] that minimizes a suitable global potential
defined on the set of all grid points. One way to choose the potential is to look
for an analogy with a system of electrically charged particles. From a potential
one can derive forces that act on the particles, and one can apply a time stepping
procedure in order to let the system converge to a state with minimum energy.
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With a suitable choice of the repulsing force Fij of particles Pi , Pj , steady 
state of the system is usually achieved after a few iterations. An example of 
the repulsing force F that respects the variable mesh density h is 

The total force working on each particle $P_i$ is computed from contributions
of particles that lie within a limited distance from $P_i$. In this way we use the
oct-tree structure available from Step 1 to avoid quadratic complexity of the
algorithm. One further uses the magnitude of forces working at each particle
together with the mesh density function in order to calculate a suitable time
step. The time step is defined in such a way that any particle $P_i$ is displaced
by at most $h(P_i)/D$, $D > 0$. The default value of the parameter D in the code is
D = 5. If D is chosen too large, the system will evolve very slowly and one
will have to perform a large number of iterations before the steady state is
reached. On the other hand, too small a value of D would cause instabilities
and prevent the system from converging at all.

In addition to that one has to withdraw the kinetic part of the total energy 
from the system after each time step by resetting the velocities of all particles to 
zero. This action is analogous to cooling of a physical system that is necessary 
in order to reach the bottom of the potential well. 

When moving a particle, we have to check whether it crosses the boundary
$\partial\Omega$ on its way. If so, we compute the intersection of the boundary with its
trajectory and position the particle at the point of intersection. At this moment
the particle becomes a boundary particle and it is not allowed to leave
the boundary anymore. From now on the particle is moved along the
boundary: we compute the force acting on it but only consider its projection
onto the plane tangential to $\partial\Omega$. The outline of the algorithm is as follows:

Algorithm for the distribution of grid points:

1. Compute the total number of grid points N.

2. Place the points into $\Omega$ in a pseudo-random way.

3. Until steady state is reached, repeat:

   3.1. For each particle $P_i$ do:

        3.1.1. Calculate the force $F_i$ acting on $P_i$.

        3.1.2. Does $P_i$ lie on the boundary $\partial\Omega$?

               YES: Compute the projection of $F_i$ to the boundary.
                    Move the particle according to the projection.

               NO: Does the last trajectory of $P_i$ intersect with $\partial\Omega$?
                   YES: Place $P_i$ to the point of intersection.
                   NO: Move the point $P_i$ according to $F_i$.

4. STOP.
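
To illustrate the structure of this iteration, the following sketch runs a one-dimensional analogue on the interval (0, 1). It is heavily simplified and rests on assumptions: the repulsive force law (an inverse-square law shifted so that it vanishes at a cutoff distance of 2h) is chosen for illustration only, the mesh density h is constant, the starting positions are a deterministic cluster rather than pseudo-random, and since the boundary of an interval consists of two points, a particle that reaches it simply stays there. Only the displacement cap h/D and the absence of carried-over velocities follow the description above.

```python
def relax_points(points, h=0.1, D=5.0, dt_max=0.005, steps=3000):
    """1D analogue of the grid point distribution on Omega = (0, 1)."""
    pts = list(points)
    pinned = [False] * len(pts)       # True once a particle sits on the boundary
    cutoff = 2.0 * h                  # only nearby particles interact
    for _ in range(steps):
        forces = []
        for i, x in enumerate(pts):
            total = 0.0
            for j, y in enumerate(pts):
                r = x - y
                if i != j and 0.0 < abs(r) < cutoff:
                    # Assumed force law, continuous at the cutoff distance.
                    magnitude = (h / abs(r)) ** 2 - (h / cutoff) ** 2
                    total += magnitude if r > 0 else -magnitude
            forces.append(total)
        fmax = max(abs(f) for f in forces) or 1.0
        dt = min(dt_max, (h / D) / fmax)      # no particle moves more than h / D
        for i, f in enumerate(forces):
            if pinned[i]:
                continue                      # boundary point: nowhere to slide in 1D
            x = pts[i] + dt * f               # velocities are not carried over
            if x <= 0.0 or x >= 1.0:
                x = min(max(x, 0.0), 1.0)     # place on the boundary and keep it there
                pinned[i] = True
            pts[i] = x
    return sorted(pts)

start = [0.15 + 0.07 * i for i in range(11)]  # compressed cluster of 11 points
final = relax_points(start)
gaps = [b - a for a, b in zip(final, final[1:])]
```

In this toy setting the two outermost particles are pushed onto the boundary and the remaining ones settle at a spacing close to h.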
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Let us remark that it is desired that the mesh density function h{x) does 
not vary too fast, otherwise tetrahedra with obtuse angles would arise in the
corresponding areas of the domain, which would lead to low quality of the
resulting mesh. It is difficult to state any quantitative conditions, but we can
roughly say that the function $h(x)$ should be Lipschitz-continuous with the
Lipschitz constant not greater than 1/3. However, this value depends on the
domain geometry. Generally speaking, the function $h(x)$ should be chosen as
smooth as possible.



4 Surface meshing 

The set of boundary grid points is used to generate a surface mesh; this is
a standard task that virtually all mesh generators have in common. We apply
an Advancing Front (AF) algorithm [4] for this purpose. See, e.g., [2] for a
more general description of AF algorithms.

Because of some temporary technical difficulties that we hope to overcome
soon, we request the user to partition the surface of the domain into M
subregions $\partial\Omega_1, \partial\Omega_2, \ldots, \partial\Omega_M$ of simpler shapes. The function
Numbering(x,y,z) then returns the index k of the subregion $\partial\Omega_k$ the point
[x, y, z] belongs to. Ordered chains of abscissas (line segments) that separate a
subregion $\partial\Omega_k$ from the rest of $\partial\Omega$ are called frontiers.

We generate the surface triangulation for each subregion $\partial\Omega_k$ separately
using an AF technique. The k-th frontier is the starting point. We copy all its
abscissas to the list of abscissas A. Then, for the first abscissa (P, Q) in A, we
search for a boundary grid point Z = [x,y,z] such that Numbering(x,y,z) = k and
the angle (P, Z, Q) of abscissas (P, Z) and (Q, Z) is maximal. In other words,
we look for the maximizer Z of the angle (P, Z, Q) in the set of all boundary
grid points such that Numbering = k. The fact that the angle (P, Z, Q) is
maximal ensures that there is no boundary grid point lying inside the triangle
(P, Q, Z). If the mesh density function $h(x)$ is Lipschitz-continuous with a
reasonably small Lipschitz constant and we have reached a steady state in the
process of iterative grid point distribution, we can always find a maximizer Z
such that the triangle (P, Q, Z) obeys the minimal-angle rule, and this results
in a high quality mesh.

When the point Z is found, we add the newly created triangle into the list
of surface triangles $T_S$ and remove the abscissa (P, Q) from the list A. Then
we check if the abscissas (P, Z) and (Q, Z) are already contained in A; if so,
we remove them from the list. In the opposite case we add them to A. We
repeat the search for a maximizer for the next abscissa in the list, until the list
becomes empty; this means that the subregion $\partial\Omega_k$ is covered with a surface
triangulation. We can sketch the algorithm as follows:

Surface triangulation algorithm:

1. For each subregion $\partial\Omega_k \subset \partial\Omega$:

   1.1. Copy its frontier abscissas into a list A.

   1.2. Repeat until A is empty:

        1.2.1. Find a boundary grid point $Z \in \partial\Omega_k$ that maximizes
               the angle (P, Z, Q) for the first abscissa (P, Q) in A.

        1.2.2. Add the triangle (P, Q, Z) to the surface mesh $T_S$.

        1.2.3. Remove the abscissa (P, Q) from A.

        1.2.4. Is abscissa (P, Z) already contained in A?
               YES: Remove it from A.
               NO: Add it to A.

        1.2.5. Is abscissa (Q, Z) already contained in A?
               YES: Remove it from A.
               NO: Add it to A.

2. STOP
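
The bookkeeping of the list A and the maximal-angle selection can be tried out in a planar setting. The sketch below is a two-dimensional analogue, not the surface mesher itself: it assumes that the frontier is oriented so that the unmeshed region lies to the left of each segment, and it restricts the candidates for Z to points on that side, a detail that the description above leaves implicit. All function names are hypothetical.

```python
import math

def angle_at(z, p, q):
    """Angle (P, Z, Q) at the vertex Z, in radians."""
    a = (p[0] - z[0], p[1] - z[1])
    b = (q[0] - z[0], q[1] - z[1])
    cos = (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))
    return math.acos(max(-1.0, min(1.0, cos)))

def left_of(p, q, z):
    """True if Z lies strictly to the left of the directed segment P -> Q."""
    return (q[0] - p[0]) * (z[1] - p[1]) - (q[1] - p[1]) * (z[0] - p[0]) > 1e-12

def toggle(front, a, b):
    """Remove the segment {a, b} from the front if present, else add a -> b."""
    for seg in front:
        if set(seg) == {a, b}:
            front.remove(seg)
            return
    front.append((a, b))

def advancing_front(frontier, points):
    """Triangulate the region enclosed by an oriented frontier."""
    front, triangles = list(frontier), []
    while front:
        p, q = front.pop(0)                   # first abscissa; removed from A
        candidates = [z for z in points if left_of(p, q, z)]
        z = max(candidates, key=lambda c: angle_at(c, p, q))
        triangles.append((p, q, z))
        toggle(front, p, z)                   # keeps the unmeshed side on the left
        toggle(front, z, q)
    return triangles

corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
points = corners + [(0.5, 0.5)]
frontier = [(corners[i], corners[(i + 1) % 4]) for i in range(4)]
triangles = advancing_front(frontier, points)
```

For the unit square with one interior point, every frontier segment is joined with the centre point and the list A empties after four triangles.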

The result of this algorithm is the surface triangulation list $T_S$. Examples of
two different surface meshes on the domain $\Omega$ from Fig. 1 are shown in Fig. 3.

Obviously the quality of the mesh strongly depends on the distribution
of grid points. In general, it is extremely difficult to give a proof of
stability of the meshing algorithm or statements about the quality of the mesh.
According to our experience, if the distribution of grid points is reasonable,
the meshing algorithm is stable and produces meshes of very good quality,
containing predominantly almost-equilateral triangles.



5 Volume meshing 

We use a generalization of a two-dimensional algorithm [4] for the construction
of the volume mesh $T_V$. The input parameters are the list of surface triangles
$T_S$ and the set of inner grid points. The idea of the volume meshing algorithm is
similar to the case of the surface mesh, again based on the AF technique.

We start with the first triangle (P, Q, R) in the list $T_S$. For this triangle we
find a grid point Z such that the sum of the angles
(P, Z, Q) + (Q, Z, R) + (P, Z, R) is maximal. According to our experience, this
criterion turned out to be optimal for the meshing algorithm. Next we add the
newly created tetrahedron (P, Q, R, Z) into the list $T_V$ of tetrahedra and delete
the triangle (P, Q, R) from the list $T_S$. We check if the triangles (P, Q, Z),
(P, R, Z) and (Q, R, Z) are contained in $T_S$. If so, we remove them from the list.
In the opposite case we add them to the list. Then we proceed to the next triangle
in $T_S$ and so on, until $T_S$ becomes empty. The algorithm is written as follows:

Volume meshing algorithm:

1. Repeat until $T_S$ is empty:

   1.1. Find a maximizer Z for the first triangle (P, Q, R) in $T_S$.

   1.2. Add the tetrahedron (P, Q, R, Z) into the list $T_V$.

   1.3. Remove the triangle (P, Q, R) from $T_S$.

   1.4. Is triangle (P, Q, Z) contained in $T_S$?
        YES: Remove it from $T_S$.
        NO: Add it to $T_S$.

   1.5. Is triangle (P, R, Z) contained in $T_S$?
        YES: Remove it from $T_S$.
        NO: Add it to $T_S$.

   1.6. Is triangle (Q, R, Z) contained in $T_S$?
        YES: Remove it from $T_S$.
        NO: Add it to $T_S$.

2. STOP

Fig. 3. Meshes on the example geometry from Fig. 1. The first one is gradually
refined towards the thinner end of the spiral and has 1460 surface elements. The
other consists of 1872 uniformly-sized elements. In both cases geometry analysis
parameters $TOL_\Omega$ = 0.15 and $TOL_{\partial\Omega}$ = 0.03 were used.

According to our experience, the algorithm is stable and it produces tetrahe- 
dral meshes of very good quality. 



6 Parallelisation 

All algorithms utilized for the adaptive oct-tree analysis of the boundary
$\partial\Omega$, distribution of grid points and generation of the surface and volume
meshes are extremely well suited for running in parallel. Since we are in the
middle of parallelization of the code, let us only give a brief description of the
techniques we use.

6.1 Parallel adaptive identification of the boundary $\partial\Omega$

The initial cube C, $\Omega \subset C$, is split into eight subcubes in the initial step.
The algorithm is then run on each of the subcubes recursively, fully independently
of the other instances. Since the resulting subdomains of $\Omega$ are disjoint, one
can run eight instances of the boundary analysis algorithm without any risk of
potential collisions. If there are more than eight CPUs, it is possible
to start the parallel run on further levels of recursion.

6.2 Parallel distribution of grid points 

When distributing the grid points, the most time-consuming part of the
algorithm is the computation of the total force acting on the particles. With
$N_{CPU}$ processors at our disposal, one divides $\Omega$ into $N_{CPU}$ subdomains.
No domain decomposition method is needed since splitting the initial cube C does
the job. An independent instance of the algorithm is run for each of the resulting
subdomains.

6.3 Parallel generation of surface and volume meshes 

The surface of the domain is again split into subregions of simpler shapes in 
the same spirit as in Section 4. The construction of the surface triangulation 
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in every subregion is a task independent of events in the other subregions. 
For the sake of technical simplicity, in our conception the number of CPUs is 
limited by the number of physical areas. This simplification will be eliminated 
later. 

A similar idea as above is applied to the volume meshing - the initial cube 
is split and independent instances of the volume meshing algorithm are run on 
the subcubes. After the algorithm finishes independent runs in the subcubes, 
one final sweep is done to generate tetrahedra that lie across the boundaries 
of the subcubes. Since these are not many, a single process is sufficient to 
accomplish this task. 
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Summary. We present recent results using adaptive finite element methods, based 
on a posteriori error estimates, to compute various output functionals for
incompressible flow problems in 3d, for both laminar and turbulent flows. The a posteriori
error estimates are based on the solution of an associated dual problem with data 
connected to the output functional we want to compute. 



1 Introduction 

We present recent results from [15, 14], extending earlier results in [17, 13],
where we use adaptive finite element methods, based on a posteriori error
estimates, to compute various output functionals in incompressible flow problems
in 3d, for both laminar and turbulent flows. The a posteriori error estimates are
based on the solution of an associated linearized dual problem that contains
information about error propagation in space-time.

The idea of using duality arguments in a posteriori error estimation goes
back to Babuška and Miller [2] in the context of postprocessing 'quantities of
physical interest' in elliptic model problems. A framework for more general
situations has since then been systematically developed by, in particular, Eriksson
& Johnson and Becker & Rannacher, with coworkers, see e.g. [6, 4, 19, 20].
Applications to incompressible flow have been increasingly advanced with
computation of functionals such as the drag coefficient for 2d stationary benchmark
problems in [3, 9], and drag and lift coefficients and pressure differences for 3d
stationary benchmark problems in [15]. In [17] time dependent problems in 3d
are considered, and the extension to Large Eddy Simulation (LES) of turbulent
flow is investigated in [13]. In [14] a temporal mean of the drag coefficient of
a surface mounted cube in a turbulent channel flow is computed using a LES.

If we use a subgrid model in a LES, the subgrid modeling error is included
in the a posteriori error estimates, which opens the possibility of comparing
the error using different subgrid models. Altogether, the a posteriori error
estimates open the possibility of adaptively choosing both an optimal mesh and an
optimal subgrid model. This approach to a posteriori error estimation with
respect to the averaged solution, using duality techniques, in terms of a modeling
error and a discretization error was developed for convection-diffusion-reaction
equations in [10, 18, 16, 11, 12]. Related approaches with a posteriori error
estimates in terms of a modeling and a discretization contribution to the total
error have been suggested. For example, more recently in [5] similar ideas are
presented with applications to 2d convection-diffusion-reaction problems.

Due to the local nature of turbulence in many applications, in particular 
for the problem in [14], we stress the possibilities of adaptive mesh refinement 
for such problems. In [14] we are able to locally resolve scales of motion cor- 
responding to a Reynolds number of about 1000, and in theory we would be 
able to locally resolve scales corresponding to a Reynolds number of about 
10^, using an ordinary PC or laptop computer. 



2 Turbulent flow and LES 

The incompressible Navier-Stokes equations for a spatial domain
$\Omega \subset \mathbb{R}^3$ take the form:

$$\dot{u} + (u \cdot \nabla) u - \nu \Delta u + \nabla p = f, \quad \nabla \cdot u = 0, \quad \text{in } \Omega \times I, \qquad (1)$$

where $u(x,t) = (u_i(x,t))$ is the velocity vector and $p(x,t)$ the pressure of the
fluid at $(x,t)$, $f$ is a given driving force, $\nu$ is the kinematic viscosity, and
$I = (0,T)$ is a time interval. We assume that (1) is normalized so that the
reference velocity and typical length scale are both equal to one. The Reynolds
number Re is then equal to $\nu^{-1}$.

For low Re we may have time independent solutions that satisfy the
stationary Navier-Stokes equations, where we simply drop the time derivative in
(1). For higher Re we have time dependent solutions, and for sufficiently high
Re we get turbulent solutions.

In a turbulent flow we are typically not able to resolve all scales of motion
computationally. We may instead aim at computing a running average $u^h$ of
$u$ on a scale $h$, defined by

$$u^h(x,t) = \frac{1}{|Q_h|} \int_{Q_h} u(x + y, t)\, dy, \qquad (2)$$

where $h = h(x,t)$ is a parameter related to the local resolution of the problem
and $Q_h = \{y \in \mathbb{R}^3 : |y_j| \le h/2\}$. In the LES literature it is common to
define the averaging operator through convolution by a certain filter function, and
there is a multitude of filter functions being used. Though we only consider
the case of the filter corresponding to (2) in this paper, the techniques for
a posteriori error estimation are general and apply to other filters, possibly
with modifications for commutation errors associated with such filters.

By an extension of $(u,p,f)$ to $\mathbb{R}^3$ by reflection for all $x \notin \Omega$, the averaging
operator (2) commutes with space and time differentiation. If we take the
running average of the equations (1), corresponding to a LES, we obtain the
following equations for $u^h$:

$\dot{u}^h + (u^h \cdot \nabla)u^h - \nu \Delta u^h + \nabla p^h + \nabla \cdot \tau^h(u) = f^h, \quad \nabla \cdot u^h = 0, \quad \text{in } \Omega \times I,$ (3)

where $\tau^h_{ij}(u) = (u_i u_j)^h - u_i^h u_j^h$ is the Reynolds stress tensor. The closure
problem of LES is how to model $\tau^h(u)$ in terms of $u^h$ in a subgrid model. In
this paper we focus on the computation of chosen output functionals for the
problem (3) using adaptive finite element methods. We refer to [8, 24] and
the references therein for work on subgrid modeling for LES, and to [14] for
details on the mathematical formulation of the problems (1) and (3).
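
The running average and the resulting Reynolds stress can be illustrated in one space dimension. The following is only a minimal sketch under stated assumptions: the periodic 1D signal, the grid size and the filter width are hypothetical choices made for illustration, and the discrete box filter stands in for the integral in (2).

```python
import math

def box_filter(values, width):
    """Periodic running average over `width` consecutive samples (width odd)."""
    n, half = len(values), width // 2
    return [sum(values[(i + j) % n] for j in range(-half, half + 1)) / width
            for i in range(n)]

n = 64
# Hypothetical 1D periodic "velocity": one large-scale wave plus one fine-scale wave.
u = [math.sin(2 * math.pi * i / n) + 0.3 * math.sin(2 * math.pi * 16 * i / n)
     for i in range(n)]

width = 5                                      # filter scale of 5 grid cells (assumed)
u_h = box_filter(u, width)                     # discrete analogue of the average (2)
uu_h = box_filter([v * v for v in u], width)   # average of the product u*u
tau = [a - b * b for a, b in zip(uu_h, u_h)]   # (u u)^h - u^h u^h

print(min(tau), max(tau))
```

Since the box filter has positive weights, the quantity (u u)^h - u^h u^h is a local variance and is therefore nonnegative; it is nonzero wherever the signal varies within the averaging window, which is the term a subgrid model has to account for.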



3 Adaptive finite element methods 

An adaptive algorithm includes feed-back from computation to achieve the 
computational goal with minimal computational cost. In an adaptive finite 
element method this feed-back from computation relies on a posteriori error 
estimates. 

In [15, 14] we compute approximations $g(U,P)$ of functionals $g(u,p)$, where
$(U,P)$ is a numerical approximation of $(u,p)$, and we prove a posteriori error
estimates of the form

$|g(u,p) - g(U,P)| \le \sum_{K \in \mathcal{T}_k} \mathcal{E}_K^k,$ (4)

where

(5)

is an error indicator for element $K$ in the mesh $\mathcal{T}_k$, with $R_i$ residuals, and
dual weights from the solution of an associated linearized dual problem, at
iteration $k$. An adaptive algorithm for computing approximations $g(U,P)$, to
a tolerance TOL, then takes the form:



Algorithm 1 (Adaptive mesh refinement) Start at k = 0, then do

(1) compute approximation to the primal problem on $\mathcal{T}_k$

(2) compute approximation to the dual problem on $\mathcal{T}_k$

(3) if $\sum_{K \in \mathcal{T}_k} \mathcal{E}_K^k <$ TOL then STOP, since $|g(u,p) - g(U,P)| <$ TOL, else

(4) refine a fixed fraction of the elements in $\mathcal{T}_k$ with largest $\mathcal{E}_K^k$

(5) set k = k + 1, then goto (1)
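
The control flow of Algorithm 1 can be sketched independently of any flow solver. The code below is an illustrative sketch only: the function names, the marking fraction of 0.3, and the 1D toy problem (element indicators modelled on a midpoint-rule error for the integral of sqrt(x) on [0, 1]) are assumptions introduced here and are not the method of [15, 14], where the indicators come from primal and dual Navier-Stokes solves.

```python
import math

def mark_fixed_fraction(indicators, fraction=0.3):
    """Indices of the ceil(fraction * n) elements with the largest indicators."""
    n_mark = max(1, math.ceil(fraction * len(indicators)))
    order = sorted(range(len(indicators)), key=lambda i: indicators[i], reverse=True)
    return sorted(order[:n_mark])

def adaptive_loop(mesh, estimate, refine, tol, fraction=0.3, max_iter=50):
    """Estimate -> stopping test -> mark -> refine, as in steps (1)-(5)."""
    for k in range(max_iter):
        indicators = estimate(mesh)        # stands in for the primal and dual solves
        if sum(indicators) < tol:          # step (3): stopping criterion
            return mesh, k
        mesh = refine(mesh, mark_fixed_fraction(indicators, fraction))  # step (4)
    return mesh, max_iter

def toy_estimate(mesh):
    """Hypothetical indicator h^3 |f''(m)| / 24 for f(x) = sqrt(x) on each interval."""
    out = []
    for a, b in mesh:
        h, m = b - a, 0.5 * (a + b)
        out.append(h ** 3 * 0.25 * m ** -1.5 / 24.0)
    return out

def toy_refine(mesh, marked):
    """Bisect every marked interval, keep the others unchanged."""
    marked = set(marked)
    new_mesh = []
    for i, (a, b) in enumerate(mesh):
        if i in marked:
            m = 0.5 * (a + b)
            new_mesh += [(a, m), (m, b)]
        else:
            new_mesh.append((a, b))
    return new_mesh

mesh, iterations = adaptive_loop([(0.0, 1.0)], toy_estimate, toy_refine, tol=1e-4)
print(iterations, len(mesh))
```

In this toy setting the refinement concentrates near x = 0, where the indicator is largest, which mirrors the way the duality based indicator concentrates elements where they matter for the chosen output.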



4 Numerical examples

We now present two different applications of Algorithm 1 to incompressible 
flow in 3d; first a stationary flow from [15], and then a turbulent flow from 
[14]. For details of the computations and the a posteriori error estimates we 
refer to [15, 14]. 

4.1 Stationary benchmark problems in 3d 

In [25], computational results for a collection of benchmark problems for
laminar flow around a cylinder in 2d and 3d are presented, with contributions
from 17 research groups. We consider the case of 3d stationary flow
around a cylinder with square cross-section $D \times D$, with $D = 0.1$, centered
at $(0.5, 0.2, 0.205)$ and aligned in the $x_3$-direction, in a channel of dimensions
$2.5 \times H \times H$, with $H = 0.41$. We have no slip boundary conditions
on the cylinder and the channel walls. At the outflow boundary we use a
transparent outflow condition, see [23], and the inflow condition is given by
$u(0,x_2,x_3) = (16 U_m x_2 (H - x_2) x_3 (H - x_3)/H^4, 0, 0)$. The kinematic viscosity
is $\nu = 10^{-3}$ and $U_m = 0.45$, which gives a Reynolds number $Re = UD/\nu = 20$,
with $U = 4u_1(0,H/2,H/2)/9$.
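
The stated Reynolds number follows directly from the inflow profile. As a quick check, the short sketch below evaluates the profile at the channel centre and forms Re = UD/ν from the values given in the text; the function name `inflow_u1` is introduced here only for illustration.

```python
H, D, U_m, nu = 0.41, 0.1, 0.45, 1e-3

def inflow_u1(x2, x3):
    """First velocity component of the inflow profile at x1 = 0."""
    return 16 * U_m * x2 * (H - x2) * x3 * (H - x3) / H ** 4

U = 4 * inflow_u1(H / 2, H / 2) / 9   # reference velocity: 4/9 of the peak inflow value
Re = U * D / nu
print(U, Re)
```

The peak inflow velocity equals U_m = 0.45, so U = 0.2 and Re = 0.2 * 0.1 / 0.001 = 20, up to rounding.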

We consider the computation of the drag coefficient, and the computation of
a pressure difference upstream and downstream of the cylinder, using a cG(1)
method (continuous piecewise linear trial and test functions) on tetrahedral
meshes for both the primal and the dual problems.

To evaluate the performance of the duality based error indicator (5) as
a refinement criterion in Algorithm 1, we compare with a commonly used
alternative error indicator

(6)

where $\|\cdot\|_K$ is a norm on the element $K$, based only on the size of the residuals,
coupling to energy estimates (see e.g. [1]).



Computation of the drag coefficient. The computational goal is to approximate
the drag coefficient $c_D$ defined in [25] by

$c_D = \frac{2 F_D(u,p)}{U^2 D H},$ (7)

where $F_D(u,p)$ is the drag force on the cylinder. Based on the results in [25] we
choose $c_D = 7.6$ as our reference value.

In Figure 1 we compare the convergence rates of the two error indicators
(5) and (6) with respect to the reference value $c_D = 7.6$. It is obvious that the
refinement criterion (5), based on both the residual and the solution to the
dual problem, does a better job than the refinement criterion (6), solely based
on the residuals without any information from the dual problem relating the
residual to the error in the drag coefficient $c_D$.

Fig. 1. Convergence rates for the computation of the drag coefficient $c_D$ (left), and
the pressure difference $\Delta p$ (right), for duality based refinement ('o') and residual
based refinement ('*'), as a log-log plot of number of unknowns versus relative errors.

We then evaluate the a posteriori error estimates as a stopping criterion
for the adaptive algorithm by introducing the notion of an effectivity index
$I_{eff}$, defined by

$I_{eff} = \text{estimated error} / \text{true error},$ (8)

and in Table 1 we present $I_{eff}$ as a function of the number of unknowns. The
a posteriori error estimates in this case are quite sharp. After a few initial
refinements the error estimates are off by less than a factor 2, and may thus be
useful as a stopping criterion.



Table 1. Effectivity indices $I_{eff}$ = estimated error / true error for computing the
drag coefficient $c_D$ (left), and the pressure difference $\Delta p$ (right), as functions of the
number of unknowns.

    #dof       c_D: I_eff    |    #dof       Δp: I_eff
    5,656       3.36         |    5,656       0.85
    7,456      13.54         |    8,620       0.99
    11,996      4.32         |    14,044      0.62
    18,336      2.53         |    21,636      0.71
    33,120      2.26         |    36,872      0.88
    62,252      1.41         |    67,412      0.92
    116,616     1.27         |    80,392      1.11
    225,588     0.92         |    222,756     1.14
    436,444     0.76         |    426,612     1.20
    844,956     0.66         |    797,940     1.08
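
To make the "factor 2" statement concrete, the sketch below filters the drag coefficient column of Table 1 for the meshes on which the estimate is within a factor 2 of the true error. It is a small illustration only; the helper `within_factor` is a name introduced here, and the dof counts are read as integers (the dot in the printed values is taken to be a thousands separator, which is an assumption).

```python
# (number of unknowns, I_eff) for the drag coefficient, copied from Table 1.
drag_rows = [(5656, 3.36), (7456, 13.54), (11996, 4.32), (18336, 2.53),
             (33120, 2.26), (62252, 1.41), (116616, 1.27), (225588, 0.92),
             (436444, 0.76), (844956, 0.66)]

def within_factor(i_eff, factor=2.0):
    """True if estimated and true error differ by at most `factor` either way."""
    return 1.0 / factor <= i_eff <= factor

sharp = [dof for dof, i_eff in drag_rows if within_factor(i_eff)]
print(sharp)
```

With the tabulated values, the estimate stays within a factor 2 on the five finest meshes, starting at 62,252 unknowns.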

Computation of a pressure difference. Next we consider the problem
of computing the pressure difference between two points upstream and
downstream of the cylinder respectively, defined by $\Delta p = p(x^1) - p(x^2)$, with
$x^1 = (0.45, 0.20, 0.205)$ and $x^2 = (0.55, 0.20, 0.205)$. Based on the results in
[25] we use $\Delta p = 0.176$ as a reference value.

In Figure 1 we compare the error indicators (5) and (6), and we find that
the duality based approach again is the better. In Table 1 we present effectivity
indices for the a posteriori error estimates, which we find to be quite sharp
with $I_{eff}$ close to unity.

4.2 Turbulent flow around a surface mounted cube 

We now use Algorithm 1 to compute the temporal mean of the drag coefficient
$\bar{c}_D$ over a time interval $I = [T_0, T]$, with $T_0 = 10$ and $T = 20$, defined by

$\bar{c}_D = \frac{1}{|T - T_0|} \int_{T_0}^{T} c_D(t)\,dt,$ (9)

where $c_D(t)$ is the drag coefficient at time $t$ for a surface mounted cube in
a turbulent channel flow, using a cG(1)cG(1) method (continuous piecewise
linear functions in space-time) for both the primal and the dual problem, on
tetrahedral meshes $\mathcal{T}_k$ that we choose to be constant in time for each iteration $k$.
In the definition of the LES for the adaptive step $k$ we let $h = h(x)$ be defined to be
the piecewise constant function that equals the diameters of the finite elements
in the computational mesh $\mathcal{T}_k$.
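
For a sampled drag history the temporal mean (9) can be approximated by a quadrature rule. The following is a minimal sketch assuming trapezoidal quadrature and a synthetic, purely hypothetical signal c_D(t) = 1.5 + 0.2 sin(2πt); neither the quadrature nor the signal is taken from [14].

```python
import math

def temporal_mean(times, values):
    """Trapezoidal approximation of (1/|T - T0|) * integral of values over [T0, T]."""
    integral = sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
                   for i in range(len(times) - 1))
    return integral / abs(times[-1] - times[0])

# Synthetic drag history on [T0, T] = [10, 20] (hypothetical data for illustration).
times = [10.0 + 0.01 * i for i in range(1001)]
c_d = [1.5 + 0.2 * math.sin(2 * math.pi * t) for t in times]

mean_cd = temporal_mean(times, c_d)
print(mean_cd)
```

Because the synthetic oscillation completes a whole number of periods over the interval, the computed mean recovers the offset 1.5 up to rounding.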

In our computational model we use the Navier-Stokes equations to model
the incompressible fluid around a cubic body of dimension $H \times H \times H$ that sits
on the floor of a rectangular channel of length $15H$, height $2H$, and width $7H$,
centered at $(3.5H, 0.5H, 3.5H)$. At the inlet we use a velocity profile interpolated
from experiments, we use no slip boundary conditions on the body and
the vertical boundaries, slip boundary conditions on the lateral boundaries,
and a transparent outflow boundary condition. The viscosity $\nu$ is chosen to
give a Reynolds number $Re = U_b H/\nu = 40{,}000$, where we have used $U_b = 1.0$.

We use no subgrid model in the computations, but we use the following
scale similarity subgrid model

(10)

from [22] to estimate the modeling residual, see [14], measuring the small scale
influence on the resolved scales.

In Figure 2 we plot the mean drag coefficient $\bar{c}_D$ as a function of number of
degrees of freedom. We find that even though we do not reach full convergence
using the available number of degrees of freedom, the value for the mean drag
coefficient seems to asymptotically approach a value between 1.45 and 1.5. We
know of no experimental reference values of $\bar{c}_D$, but in [21] $\bar{c}_D$ is approximated
computationally. The computational setup is similar to the one in [14] except
the numerical method, the length of the time interval, and that we in [14] use
a channel of length $15H$, compared to a channel of length $10H$ in [21]. Using
different meshes and subgrid models, approximations of $\bar{c}_D$ in the interval
[1.14, 1.24] are presented in [21].

Fig. 2. Mean drag coefficient $\bar{c}_D$ over the time interval [10, 20] as a function of the
number of degrees of freedom (left), discretization error $e_D$ ('o') and modeling error
$e_M$ ('*') after 13 adaptive mesh refinements as functions of the length of the time
interval $[T_0, T]$, with $T$ fixed and $T_0$ varying, assuming $U_h(T_0) = u^h(T_0)$ (middle),
and a posteriori error estimates of the discretization error $e_D$ ('o') and the modeling
error $e_M$ ('*') over the time interval [10, 20], as functions of the number of degrees of
freedom in a $\log_{10}$-$\log_{10}$ plot (right).

The diameter of the smallest element in the mesh $\mathcal{T}_{14}$ is about $h = 10^{-3}$ (with
$H = 0.1$), which corresponds to a local Reynolds number $Re_{loc} \approx (2H/h)^{4/3} \approx
1200$ (with channel height $2H$), using standard Kolmogorov arguments of turbulent
flow [7], or $Re_{loc} \sim h^{-1} = 1000$, assuming the numerical viscosity of the
cG(1)cG(1) method is acting as a term $h(\nabla U_h, \nabla v_h)$. That is, we are locally
able to resolve scales corresponding to a Reynolds number of about 1000, even
though it would be impossible globally at a similar computational cost. Since
turbulence often is a local phenomenon, adaptive methods are ideal for computation
of turbulence. In theory, if we refine the same elements in each step of
the algorithm we would get a finest h ^ H x (1/2)^^ 10“^, corresponding to 

Reioc ~ 10^. That is, we would be able to locally resolve flows corresponding to 
a Reynolds number of 10^ in a Direct Numerical Simulation using an ordinary 
PC or laptop computer. 
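
The two local Reynolds number estimates quoted above for the current mesh can be reproduced with a few lines of arithmetic. This is only a sketch of the scaling arguments as read from the text, with h = 10^-3 and H = 0.1, and with the Kolmogorov exponent 4/3 taken as the assumed reading of the garbled exponent (it is the value that reproduces the quoted 1200).

```python
H = 0.1      # cube side length, as stated in the text
h = 1e-3     # diameter of the smallest element

re_kolmogorov = (2 * H / h) ** (4.0 / 3.0)   # smallest scale ~ (2H) * Re^(-3/4)
re_numerical_viscosity = 1.0 / h             # numerical viscosity of size h

print(round(re_kolmogorov), round(re_numerical_viscosity))
```

Both estimates give a local Reynolds number of the order 10^3, consistent with the figures 1200 and 1000 in the paragraph above.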

After 13 adaptive mesh refinements we plot the a posteriori error estimates
of the discretization and the modeling errors in Figure 2 as functions of the
length of the time interval $[T_0, T]$ ($T$ fixed, $T_0$ varying), where we have assumed
that the initial solution is exact for each $T_0$, so that $U_h(T_0) = u^h(T_0)$. We find
that the error at first increases with the length of the time interval, but when
the interval exceeds a certain length the error does not increase significantly
beyond a certain level, and thus the computational cost of computing $\bar{c}_D$ is
relatively constant for time intervals longer than a certain length.

In Figure 2 we also plot the discretization and the modeling errors for $\bar{c}_D$
over the time interval [10, 20] as functions of the number of degrees of freedom,
where we note an expected decrease in the estimates of the discretization error
as we refine the mesh. We also note that the estimates of the modeling error on
the other hand increase. This might at first seem alarming, but is in fact to be
expected since in this case we have used the simple model (10) to estimate the
Reynolds stresses in the modeling residual. Even though the true Reynolds
stresses are smaller for a finer resolution $h$ of the problem, the model (10)
will in fact first increase as we resolve more scales of motion since it is solely
based on the resolved velocity fluctuations on the scale $2h$. This is of course a
problem, and in a continuation of this study we seek sharper estimates of the
Reynolds stresses based on scale extrapolation.

Remark 1. The use of a stabilized Galerkin finite element method in the com- 
putations may be viewed as a type of subgrid model in itself, since we then 
in fact solve a modified set of equations using a standard Galerkin method. 
We will further investigate this relation between numerical stabilization and 
subgrid modeling in a continuation of this work. In this paper we only con- 
sider the stabilization to be part of the numerical method and not an explicit 
subgrid model. 



5 Summary 

In this paper we have presented results from [15, 14], extending earlier results 
in [17, 13], where we use adaptive finite element methods based on a posteriori 
error estimates to compute approximations of output functionals in incom- 
pressible fluids, for both laminar and turbulent flow. The a posteriori error 
estimates are based on the solution of an associated linearized dual problem, 
and are used as error indicators for the adaptive mesh refinement algorithm. 

In the problem of computing the mean drag coefficient in a turbulent channel
flow, we emphasize the local nature of turbulence that makes adaptive
methods ideal for efficient and accurate computations. Due to the computa- 
tional goal of approximating the mean drag coefficient we refine the mesh 
according to the corresponding a posteriori error estimate, resolving scales of 
motion corresponding to local Reynolds numbers of about 1000, and in the- 
ory we would be able to resolve local scales of motion corresponding to local 
Reynolds numbers of the order 10^ to a similar computational cost. 

In continuations of this study we will address methods for sharp estimation 
of the modeling residual based on scale extrapolation, as well as adaptive 
strategies to combine numerical stabilization with subgrid modeling for LES. 



Summary. We consider a nonlinear parabolic partial differential equation that
describes the evolution of the surface morphology in the deposition of thin glassy films
by molecular beam epitaxy. The dynamics of the growth process exhibits some
unexpected initial linear behavior, before the nonlinear dynamics sets in. Therefore,
for the numerical solution we suggest a combined spectral element/finite element
approach. Results of numerical simulations are given that show a good agreement
with experimental measurements.



1 Introduction 

We consider the deposition of thin glassy films on the surface of substrates 
such as silicon by molecular beam epitaxy. Such processes play an important 
role in materials science with regard to the coating of surfaces in order to 
obtain specific surface properties (cf., e.g., [11]). 

In particular, we assume that the particle beam is impinging perpendicularly 
to the surface of the substrate (cf. Fig. 1). 

Denoting by $\Omega := [0,L]^2 \subset \mathbb{R}^2$ the surface of the substrate, the deposition
process can be described by the temporal and spatial distribution of the height
profile $u(x,t)$, $x \in \Omega$, $t > 0$, as given by

$u(x,t) = H(x,t) - Ft,$ (1)

where $H(x,t)$ is the absolute height and $F$ refers to the deposition rate which
is assumed to be constant.

Fig. 1. Schematic representation of the deposition of thin films (particle beam
impinging on the film $H(x,t)$ grown on a substrate, e.g., silicon)

As far as the development of an appropriate mathematical model is concerned,
the deposition evolves according to

$\frac{\partial u}{\partial t}(x,t) = g(u(x,t)), \quad x \in \Omega, \ t > 0,$ (2)

where the right-hand side $g$ describes the surface growth. There have been
many attempts to establish appropriate models for the morphology of deposition
processes featuring amorphous surface growth (cf., e.g., [1, 4, 13]). Here,
following [5, 10] and [11], we take three major growth mechanisms into account:

The first one describes surface growth due to particle interaction 

gi{u) ;= - 5 (1 + Au , (3) 

where S stands for the influence of interatomic and van der Waals forces. 

The second one is curvature induced surface relaxation 

g 2 (u) := - D V [ (1 + |Vm|2)-i/ 2 y ((1 + Au) ] . (4) 

Here, D refers to the material dependent diffusion coefficient. 

Finally, the third mechanism is due to structure coarsening in the sense that 
particles at locations with high gradients of the height profile move to locations 
with lower gradients. Here, we follow the model suggested by Moske (see [11]) 

gs{u) := C [ (1 + |V«|2)-i/ 2 y (1 + ] , (5) 

where C denotes the mean surface mobility. 

As long as the particle beam impinges perpendicularly onto the surface of the
substrate, we have $|\nabla u| \ll 1$. In this case, the functions $g_i = g_i(u)$, $1 \le i \le 3$,
in (3), (4), and (5) simplify to

$g_1(u) := a_1 \Delta u, \quad g_2(u) := a_2 \Delta^2 u, \quad g_3(u) := a_3 \Delta(|\nabla u|^2),$

where $a_i < 0$, $1 \le i \le 3$. The evolution equation takes the form

$\frac{\partial u}{\partial t} = \Delta(a_1 u + a_2 \Delta u + a_3 |\nabla u|^2) \quad \text{in } Q := \Omega \times [0, \infty)$ (6)

with an initial condition $u(x,0) = u_0(x)$, $x \in \Omega$, and either periodic boundary
conditions or homogeneous Neumann boundary conditions on $\Gamma = \partial\Omega$.

The constants $a_1$, $a_2$ and $a_3$ in the evolution equation are usually determined
by parameter identification with respect to experimentally obtained measurements
by using Auger spectroscopy and scanning electron microscopy.
However, if the particle beam does not impinge perpendicularly, there are
overhangs in the profile and even topological changes due to the formation
of inclusions. In this case, we must use the original form of the functions
$g_i = g_i(u)$, $1 \le i \le 3$, as given by (3), (4), (5), and resort to other techniques
as, for instance, level set methods (cf., e.g., [8, 9]; see also [3]).

We note that the nonlinear 4th order evolution equation (6) resembles the 
well-known Cahn-Hilliard equation which describes spinodal decomposition, 
i.e., phase separation in binary alloys (cf., e.g., [6]). 

The paper is organized as follows: In section 2, we will briefly address the
dynamics of the growth process which features some unexpected initial linear
behavior. This motivates the use of a combined spectral element/finite element
approach for the numerical solution of the nonlinear evolution equation
(6) that is described in sections 3 and 4. Finally, in section 5 we will give some
simulation results in terms of visualizations of the height profile for different
film thicknesses.



2 Dynamics of the growth process 

The solution of the nonlinear evolution equation exhibits some unexpected
initial linear behavior. This can be explained by an appropriate decomposition
of the spectrum $\sigma \subset \mathbb{R}$ of the associated linearized operator which is
self-adjoint and sectorial. In particular, we specify three constants

$\gamma^{-} < 0 < \gamma^{+} < \gamma^{++} < 1$

such that, referring to $\lambda_{max}$ as the maximum eigenvalue, the spectrum is
decomposed into the four parts

^ • — ( *^7 7 XrYiax) 5 ^ (7 X^fYiax XfYiax) •> 

•— (' 7 ~^ XjTq,ax 7 Xjjiax) 7 := X^nax^ 

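We further denote by $X^{--}$, $X^{-}$, $X^{+}$, and $X^{++}$ the subspaces spanned by
the corresponding eigenfunctions:

$X^{--} := \mathrm{span}\{\varphi(\lambda) \mid \lambda \in \sigma^{--}\}, \quad X^{-} := \mathrm{span}\{\varphi(\lambda) \mid \lambda \in \sigma^{-}\},$

$X^{+} := \mathrm{span}\{\varphi(\lambda) \mid \lambda \in \sigma^{+}\}, \quad X^{++} := \mathrm{span}\{\varphi(\lambda) \mid \lambda \in \sigma^{++}\}.$

Then, the direct sum of $X^{+}$ and $X^{++}$ can be shown to be a dominant subspace
which determines the dynamical behavior of solutions to the nonlinear
evolution equation in the following sense: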

Fig. 2. Illustration of the initial linear behavior of the solution

Theorem 1. Assume G H^{f2) with 

IK,0)||2,x 2 < r- := , a>0 , 

and set 

:= J dx . 

n 

Then there exists $t^* > 0$ such that the solution $u = u(x,t)$, $x \in \Omega$, $t \in [0, t^*)$,
of (6) stays with probability 1 in a vicinity of the dominant subspace

$Y := \bar{u}_0 + X^{+} \oplus X^{++}$ (7)

until at $t = t^*$ it leaves a ball $B_R(0)$ with radius $R > r$.

Proof. For a proof of this result we refer to [2].

Figure 2 illustrates the initial linear behavior of the solution in case uo = 0. 
We note that a related result for the Cahn- Hilliard equation has been estab- 
lished in [12]. 



3 Spectral Galerkin approximation 

For the spectral Galerkin approximation we consider the weak formulation of 
the implicitly in time discretized nonlinear evolution equation which involves 
the Sobolev space V := in case of periodic boundary conditions and 

V if homogeneous Neumann boundary conditions are imposed. 

Using the backward Euler scheme and denoting by $u^m \in V$ an approximation
of $u(\cdot,t_m)$ at time $t_m$ and by $\tau_m := t_m - t_{m-1}$ the time step from level $m-1$
to level $m$, the problem is as follows: Find $u^m \in V$ such that for all $\chi \in V$

$\int_\Omega u^m \chi \,dx = \int_\Omega u^{m-1} \chi \,dx + \tau_m \int_\Omega [a_1 u^m + a_2 \Delta u^m + f(u^m)] \Delta\chi \,dx,$ (8)

where the nonlinearity $f(u^m)$ is given by $f(u^m) := a_3 |\nabla u^m|^2$.

We refer to $\lambda_i$, $1 \le i \le N$, as the first $N$ eigenvalues of $-\Delta$ and denote by
$V_N := \mathrm{span}\{\varphi_i \mid 1 \le i \le N\}$ the finite dimensional subspace of $V$ spanned by
the associated orthonormal eigenfunctions.

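The spectral Galerkin approximation is then a linear combination
$u_N^m = \sum_{k=1}^{N} u_{N,k}^m \varphi_k \in V_N$ so that (8) with $V$ replaced by $V_N$ gives rise to the
nonlinear system

$(1 + a_1 \tau_m \lambda_i - a_2 \tau_m \lambda_i^2)\, u_{N,i}^m + \tau_m \lambda_i \int_\Omega f\Big(\sum_{k=1}^{N} u_{N,k}^m \varphi_k\Big) \varphi_i \,dx - u_{N,i}^{m-1} = 0, \quad 1 \le i \le N.$ (9)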

The solution of that nonlinear system by Newton's method would be quite
expensive, since it requires the computation of the Jacobian at each iteration
step. It turns out that it is sufficient to use the method of successive iterations
which corresponds to the approximation of the original problem by the
semi-implicit Euler scheme: For $\nu > 0$ compute $u_N^{m,\nu} \in V_N$ as the solution of

$\int_\Omega u_N^{m,\nu} \chi_N \,dx = \int_\Omega u_N^{m-1} \chi_N \,dx + \tau_m \int_\Omega [a_1 u_N^{m,\nu} + a_2 \Delta u_N^{m,\nu} + f(u_N^{m,\nu-1})] \Delta\chi_N \,dx, \quad \chi_N \in V_N,$ (10)

where $u_N^{m,0} := u_N^{m-1}$. In this case, we can explicitly solve for the components
of the new iterate

$u_{N,i}^{m,\nu} = \frac{u_{N,i}^{m-1} - \tau_m \lambda_i f_i^{m,\nu-1}}{1 + a_1 \tau_m \lambda_i - a_2 \tau_m \lambda_i^2}, \quad 1 \le i \le N,$ (11)

where

$f_i^{m,\nu-1} := \int_\Omega f\Big(\sum_{k=1}^{N} u_{N,k}^{m,\nu-1} \varphi_k\Big) \varphi_i \,dx, \quad 1 \le i \le N.$

We only have to evaluate the nonlinear terms $f_i^{m,\nu-1}$, which can be efficiently
done by the Fast Fourier Transform in case of periodic boundary conditions and
by the Fast Cosine Transform for homogeneous Neumann boundary conditions.
In both cases, this is done with respect to an equidistant grid consisting of
$M$ grid points in each direction, where $M$ has to be chosen larger than the
dimension of the trial space $V_N$ in order to avoid aliasing effects.
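
The mode-by-mode update of the semi-implicit scheme is simple enough to sketch directly. The code below is an illustration under stated assumptions: it uses the update u_i <- (u_i_prev - tau * lambda_i * f_i) / (1 + a1 * tau * lambda_i - a2 * tau * lambda_i^2), which follows from testing the weak form (8) with an eigenfunction of -Δ, and the parameter values, eigenvalues and the function name are hypothetical. The nonlinear coefficients f_i are passed in as plain numbers rather than computed by a transform.

```python
def semi_implicit_update(u_prev, f_hat, lam, tau, a1, a2):
    """One successive-iteration step of the semi-implicit Euler scheme, per mode.

    u_prev : coefficients of the previous time level
    f_hat  : coefficients of the nonlinearity evaluated at the previous iterate
    lam    : eigenvalues of the negative Laplacian
    """
    return [(u - tau * l * f) / (1.0 + a1 * tau * l - a2 * tau * l * l)
            for u, f, l in zip(u_prev, f_hat, lam)]

# Hypothetical parameters: a1, a2 < 0 as in the text, values chosen for illustration.
a1, a2, tau = -1.0, -1.0, 0.1
lam = [0.5, 4.0]          # one low and one high eigenvalue
u_prev = [1.0, 1.0]
f_hat = [0.0, 0.0]        # nonlinearity switched off to expose the linear part

u_new = semi_implicit_update(u_prev, f_hat, lam, tau, a1, a2)
print(u_new)
```

With these illustrative values the low mode is amplified (denominator 0.975) while the high mode is damped (denominator 2.2), which is the linear growth of long wavelengths that the spectral stage of the method is meant to capture.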







The semi-implicit Euler scheme is carried out with an automatic step-size
control. The error due to the time discretization is estimated by

$\gamma \|\tilde{u}^m - u^m\| \le \|u(t_m) - u^m\| \le \Gamma \|\tilde{u}^m - u^m\|,$

where $\tilde{u}^m \in V$ is the solution of the semi-implicit trapezoidal rule

$\int_\Omega \tilde{u}^m \chi \,dx = \int_\Omega u^{m-1} \chi \,dx + \frac{\tau_m}{2} \int_\Omega [a_1 (\tilde{u}^m + u^{m-1}) + a_2 \Delta(\tilde{u}^m + u^{m-1}) + (f(u^m) + f(u^{m-1}))] \Delta\chi \,dx, \quad \chi \in V.$ (12)

If $V = V_N$, the solution of (12) can be easily computed according to

^ [ai A, + iu^-%) - 

- A? (({i^), + - A, (/r + /r“')] , 1 < * < iV • 

In the practical realization of the step-size control, we additionally take into
account the error due to the discretization in space. The error $\|u^m - u_N^m\|_{2,\Omega}$
is caused by the neglect of those eigenmodes $i > N$ that have not been
considered in the spectral Galerkin approximation, but are relevant for the
exact computation of the nonlinearities. Therefore, for sufficiently large $P$, we
set



= E 

N<j<N+P 



^ofT 



1 + «i r„ 



\j (3-2 '^m 



Given a tolerance tol >0, we check for convergence: 






iVl|2,i7 



+ 



-1 



1 ) 






< tol 






(13) 



where (j < 1 is an appropriate weighting factor. If (13) is satisfied we proceed 
with the new time-step 



j(T tol \\u^h,n - |en-| 

■“ V llw^ - uTrh.n 



Otherwise, we repeat the previous time-step with r^. 



(14) 



4 The finite element method 

The spectral Galerkin method becomes inefficient when the nonlinear dynamics
sets in, i.e., when the solution leaves the dominant subspace $Y$ as given by
(7). Although a theoretical bound for the exit time $t^*$ is known (cf., e.g., [2]),
this bound is an overestimation and hence not practicable. Therefore, we stop
the spectral approach and switch to a finite element method, if the convergence
test (13) fails for several consecutive time-steps.

The finite element approximation is based on a reformulation of the 4th order
equation (6) as a system of two 2nd order equations

$\frac{\partial u}{\partial t} = \Delta w, \qquad w = a_1 u + a_2 \Delta u + a_3 |\nabla u|^2 \quad \text{in } Q := \Omega \times [0, \infty).$ (15)

We discretize in time by the implicit Euler method and in space by continuous,
piecewise linear finite elements with respect to a simplicial triangulation $\mathcal{T}_h$ of
$\Omega$. Denoting by $S_1(\Omega;\mathcal{T}_h)$ the associated finite element space, for each
time-step we have to solve the nonlinear system of equations:

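Find $(u_h^m, w_h^m) \in S_1(\Omega;\mathcal{T}_h) \times S_1(\Omega;\mathcal{T}_h)$ such that

$\int_\Omega \frac{u_h^m - u_h^{m-1}}{\tau_m} \chi_h \,dx + \int_\Omega \nabla w_h^m \cdot \nabla\chi_h \,dx = 0, \quad \chi_h \in S_1(\Omega;\mathcal{T}_h),$ (16)

$a_1 \int_\Omega u_h^m \psi_h \,dx - a_2 \int_\Omega \nabla u_h^m \cdot \nabla\psi_h \,dx + a_3 \int_\Omega |\nabla u_h^m|^2 \psi_h \,dx - \int_\Omega w_h^m \psi_h \,dx = 0, \quad \psi_h \in S_1(\Omega;\mathcal{T}_h).$ (17)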



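Providing a hierarchy $(\mathcal{T}_i)_{i=0}^{\ell}$ of triangulations, we solve the nonlinear system
(16), (17) on the finest grid by Newton-Multigrid:

Given $(u_\ell^{m,\nu}, w_\ell^{m,\nu}) \in S_1(\Omega;\mathcal{T}_\ell) \times S_1(\Omega;\mathcal{T}_\ell)$, we compute

$u_\ell^{m,\nu+1} = u_\ell^{m,\nu} + \delta u_\ell^{m,\nu}, \qquad w_\ell^{m,\nu+1} = w_\ell^{m,\nu} + \delta w_\ell^{m,\nu},$

where the Newton increment $(\delta u_\ell^{m,\nu}, \delta w_\ell^{m,\nu})$ is the solution of the linear system

$\int_\Omega \delta u_\ell^{m,\nu} \chi_\ell \,dx + \tau_m \int_\Omega \nabla \delta w_\ell^{m,\nu} \cdot \nabla\chi_\ell \,dx = \int_\Omega u_\ell^{m-1} \chi_\ell \,dx - \int_\Omega u_\ell^{m,\nu} \chi_\ell \,dx - \tau_m \int_\Omega \nabla w_\ell^{m,\nu} \cdot \nabla\chi_\ell \,dx, \quad \chi_\ell \in S_1(\Omega;\mathcal{T}_\ell),$ (18)

$\int_\Omega \delta w_\ell^{m,\nu} \psi_\ell \,dx - a_1 \int_\Omega \delta u_\ell^{m,\nu} \psi_\ell \,dx + a_2 \int_\Omega \nabla \delta u_\ell^{m,\nu} \cdot \nabla\psi_\ell \,dx - \int_\Omega f'(u_\ell^{m,\nu}) \delta u_\ell^{m,\nu} \psi_\ell \,dx$
$\quad = - \int_\Omega w_\ell^{m,\nu} \psi_\ell \,dx + a_1 \int_\Omega u_\ell^{m,\nu} \psi_\ell \,dx - a_2 \int_\Omega \nabla u_\ell^{m,\nu} \cdot \nabla\psi_\ell \,dx + \int_\Omega f(u_\ell^{m,\nu}) \psi_\ell \,dx, \quad \psi_\ell \in S_1(\Omega;\mathcal{T}_\ell).$ (19)

The system (18), (19) is solved by linear multigrid using incomplete LU
decomposition both as smoother on all levels $1 \le i \le \ell$ as well as an iterative solver
for the coarse grid correction equation on level $i = 0$.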

For the finite element approach we use a similar step-size control as in case 
of the spectral Galerkin method, except that we replace the estimation of the 
error due to the discretization in space by a residual-type a posteriori error 
estimator. 



5 Simulation results 

We have used the combined spectral element/finite element approach for the
numerical simulation of the deposition of the metallic glassy film ZrAlCu on
silicon substrates.

The computational domain $\Omega$ has been chosen as a square of length $L = 200$ nm
in each direction. Periodic boundary conditions have been imposed and the
initial height profile $u_0(x)$, $x \in \Omega$, has been determined randomly with $\bar{u}_0 = 0$.
In the spectral element approach we have used 125, 200, and 250 modes per
dimension, whereas for the computation of the nonlinearities by the Fast Fourier
Transform a uniform grid with $M = 400$ grid points in each direction has been
employed which is sufficiently large to avoid aliasing effects.

In the finite element method we used a hierarchy $(\mathcal{T}_i)_{i=0}^{4}$ of simplicial
triangulations with $h_0 = 1/25$ and $h_4 = 1/400$.

Fig. 3. Computed height profile for different film thicknesses [100 nm (left), 360 nm
(middle), and 480 nm (right)]

Figure 3 displays the computed height profiles for different film thicknesses in
a grey scale ranging from black (0 nm) to white (4 nm). One clearly observes
the effect of structure coarsening: a surface pattern with a mesa-like structure
evolves featuring hills with flat plateaus that are separated by narrow deep
valleys. We note that already for 125 modes per dimension the computed profiles
are both qualitatively and quantitatively in good agreement with experimentally
obtained data (cf., e.g., [11]).



Summary. A new numerical algorithm for solving semilinear elliptic problems is
presented. A variational formulation is used and critical points of a $C^1$-functional
subject to a constraint given by a level set of another $C^1$-functional (or an intersection
of such level sets of finitely many functionals) are sought. First, constrained local
minima are looked for, then constrained mountain pass points. The approach is
based on the mountain pass theorem in a constrained setting.



Weak solutions of semilinear elliptic partial differential equations can typ- 
ically be represented as critical points of nonlinear functionals. The easiest 
approach in constructing critical points is via a minimizing sequence. It leads 
to the method of the steepest descent which yields local minimizers. Another 
tool used to prove existence of critical points is the mountain pass theorem 
of Ambrosetti and Rabinowitz [1], repeated in Sec. 1. Choi and McKenna 
[4] introduced a method based on a constructive form of this theorem - the 
mountain pass algorithm. It is able to find numerical approximations to criti- 
cal points of mountain pass type (typically, saddle points at which the second 
derivative of the functional has exactly one negative eigenvalue, i.e., there is 
just one direction in the function space at which the functional decreases on 
both sides of the critical point). 

The main objective of the work presented in this contribution is to de- 
sign a numerical method (constrained mountain pass algorithm) that can find 
numerical approximations of more complicated saddle type critical points (typ- 
ically, with more negative eigenvalues of the second derivative of the functional, 
i.e., more “directions of decrease”). A similar question was posed in [5]. The 
“high-linking algorithm” presented by the authors can, however, be only ap- 
plied to a narrow family of problems. The current work presents a different 
approach that appears to be more universal. 

We illustrate the main idea on a simple example that was used in both [4]
and [5]:

$-\Delta u = u^3 \quad \text{in } \Omega = (0,1) \times (0,1), \qquad u = 0 \quad \text{on } \partial\Omega.$ (1)

Weak solutions of this problem correspond to critical points of the functional

$I(u) = \int_\Omega \Big( \frac{1}{2} |\nabla u|^2 - \frac{1}{4} u^4 \Big) dx$ (2)

defined on $H_0^1(\Omega)$. It is not difficult to see that $u = 0$ is not only a solution of
(1) but also a local minimum of $I$. Since the steepest descent method would
likely yield this trivial solution (or diverge since $I$ is not bounded from below),
a different method needs to be used to approximate nontrivial solutions. In [4]
a mountain pass solution (Fig. 1(a)) was found numerically using the mountain
pass algorithm. Later, the high-linking algorithm of [5] provided a more
complicated saddle type solution (Fig. 1(b)).




Fig. 1. Solutions of (1). Approximate interval of values of $u(\Omega)$: (a) (0,6.62],
(b) [-12.88,14.98], (c) [-16.24,16.22], (d) [-12.93,19.75]



The idea of the current approach is to “reduce” the number of the “direc- 
tions of decrease” of the functional by introducing constraints on admissible 
functions. Roughly speaking, we could expect to reduce mountain pass points 
to constrained local minima, or more complicated saddle type points to, for 
example, constrained mountain pass points. 

Define a constraint given by a new functional $J$. Let $S = \{u \in H_0^1(\Omega) \setminus
\{0\} \mid J(u) := \int_\Omega [\,|\nabla u|^2 - u^4\,] \,dx = 0\}$. By testing (1) with $u$ we find that all
nontrivial weak solutions of (1) belong to $S$. Instead of looking for critical points
of $I$ we will look for critical points of $I$ with respect to $S$, i.e., we will solve
$I'(u) - \lambda J'(u) = 0$, where $\lambda \in \mathbb{R}$ is a Lagrange multiplier. It can be shown [6]
that any solution $(u,\lambda)$, $u \ne 0$, of this equation satisfies $\lambda = 0$, hence $u$ also
solves (1). It should be noted that it is not always possible or even desirable to
find constraints such that the Lagrange multipliers vanish. As applications in
Sec. 3 show, these multipliers can be a part of the formulation of the problem.
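
The statement that every nontrivial weak solution lies on the constraint set can be verified in one line. The following is a brief sketch, assuming that problem (1) reads $-\Delta u = u^3$ with homogeneous Dirichlet data, which is the form consistent with the functional $J$ above:

```latex
\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega u^3 v \, dx
\quad \text{for all } v \in H_0^1(\Omega)
\;\;\Longrightarrow\;\; (v = u) \;\;
J(u) = \int_\Omega \bigl[\, |\nabla u|^2 - u^4 \,\bigr] \, dx = 0 .
```

Choosing the test function $v = u$ in the weak formulation thus gives $J(u) = 0$ directly, so restricting the search to the set where $J$ vanishes excludes no nontrivial solution.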

In the constrained setting, the solution in Fig. 1(a) can also be found as a 
local minimum of I with respect to the constraint S by the method of Sec. 2.1. 
Similarly, the solution of [5] in Fig. 1(b) can also be found as a mountain pass 
point of $I$ with respect to $S$ by the method of Sec. 2.2. The solutions shown in
Figs. 1(c,d) are also constrained mountain pass points found by our method;
they, however, were not found by the method of [5]. Hence already this simple
example shows advantages of the current approach. But it is the problems to 
which the high-linking method cannot be applied (e.g., those in Sec. 3) that 
make the approach unique. 

The rest of the paper is organized as follows. Section 1 summarizes known
theoretical results: the mountain pass theorem and its constrained version.
Section 2 describes the constrained steepest descent method (CSDM)
and the constrained mountain pass algorithm (CMPA). Finally, Section 3
applies the method to two problems which cannot be
handled by the high-linking algorithm of [5]: a second order problem with two
constraints and a fourth order problem on an unbounded domain.



1 Theoretical Background 

The mountain pass algorithm of [4] is based on the classical mountain pass 
theorem of [1]. The constrained mountain pass algorithm presented in this 
contribution is based on the constrained mountain pass theorem. We review 
both theorems in this section. 

Let $B$ be a real Banach space and $I \in C^1(B, \mathbb{R})$ a continuously Fréchet
differentiable functional.

Definition 1. $I$ satisfies the Palais-Smale condition at the level $a \in \mathbb{R}$ if any
sequence $\{u_n\} \subset B$ such that $I(u_n) \to a$ and $I'(u_n) \to 0$ possesses a convergent
subsequence.

Theorem 1 (mountain pass). Let $e_1, e_2$ be two distinct points in $B$. Define

$$c = \inf_{\gamma \in \Gamma} \max_{u \in \gamma([0,1])} I(u), \qquad (2)$$

where $\Gamma = \{\gamma \in C([0,1], B) \mid \gamma(0) = e_1,\ \gamma(1) = e_2\}$.

If $c > \max\{I(e_1), I(e_2)\}$ and $I$ satisfies the Palais-Smale condition at the
level $c$, then $c$ is a critical value of $I$, i.e., there exists $u \in B$ such that $I'(u) = 0$
and $I(u) = c$.
Let us now introduce $k$ constraints. Assume that $J_i \in C^1(B, \mathbb{R})$, $i \in \{1, \dots, k\}$.
We are interested in finding real numbers $\lambda_1, \dots, \lambda_k$ and a function
$u \in B$ that satisfy

$$I'(u) - \sum_{i=1}^{k} \lambda_i J_i'(u) = 0. \qquad (3)$$

Such a function $u$ is a critical point of $I$ with respect to the set

$$S = \{u \in B \mid J_1(u) = 1, \dots, J_k(u) = 1\}.$$

Equation (3) is a general formulation of problems that we want to solve
numerically in this contribution. We assume further that if $u \in S$, then $J_i'(u) \neq 0$
for all $i$ and $J_1'(u), \dots, J_k'(u)$ are linearly independent.

The constrained mountain pass theorem [6] is based on the work of Bonnet [2].
In order to use his definition we denote
$\|I'|_{S_u}\| := \inf_{a \in \mathbb{R}^k} \|I'(u) - \sum_{i=1}^{k} a_i J_i'(u)\|$ and
$tg(u) := \inf_{a_1^2 + \dots + a_k^2 = 1} \|\sum_{i=1}^{k} a_i J_i'(u)\|$.

Definition 2. $I$ and $J_1, \dots, J_k$ satisfy the Palais-Smale condition at the level
$a \in \mathbb{R}$ if for any sequence $\{u_n\} \subset B$ such that $J_1(u_n) \to 1, \dots, J_k(u_n) \to 1$ and
$I(u_n) \to a$ and either $\|I'|_{S_{u_n}}\| \to 0$ or $tg(u_n) \to 0$ there exists a convergent
subsequence.

Theorem 2 (constrained mountain pass). Let $e_1 \neq e_2$ belong to a
path-connected component of $S$. Define $c$ by (2), where
$\Gamma = \{\gamma \in C([0,1], S) \mid \gamma(0) = e_1,\ \gamma(1) = e_2\}$.

If $c > \max\{I(e_1), I(e_2)\}$ and $I$ and $J_1, \dots, J_k$ satisfy the Palais-Smale
condition at the level $c$, then $c$ is a critical value of $I$ on $S$, i.e., there exists
a solution of equation (3) with $u \in S$ and $I(u) = c$.



2 Description of the Algorithm 

2.1 Constrained Steepest Descent Method 

Let from now on $B = H$ be a Hilbert space. In order to apply Theorem 2,
convenient points $e_1, e_2 \in S$ need to be found. Choosing $e_1$ and $e_2$ as local
minima of $I$ on $S$ seems reasonable. They can be found using a constrained
steepest descent method.

The method solves numerically the following initial value problem: 

$$\frac{d}{dt} \zeta(t) = -P_{\zeta(t)} \nabla I(\zeta(t)), \qquad \zeta(0) = \zeta_0 \in S, \qquad (4)$$

where $\nabla I(u)$ is defined as the Riesz representation of the Fréchet derivative
$I'(u)$ and $P_u$ is the orthogonal projection on the tangent space of $S$ at $u \in S$
and is given by

$$P_u v = v - \sum_{j=1}^{k} \alpha_j \nabla J_j(u) \qquad \forall v \in H,$$

where the coefficients $\alpha_j$ solve the linear algebraic system

$$\sum_{j=1}^{k} \langle \nabla J_i(u), \nabla J_j(u) \rangle\, \alpha_j = J_i'(u)v \qquad \text{for } i \in \{1, \dots, k\}.$$

Properties of (4) are studied in [6]. If $\zeta$ is a solution, then $\zeta(t) \in S$ on its
interval of existence and $I(\zeta(t))$ is a nonincreasing function of $t$.
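
In a finite-dimensional setting the projection formula reduces to a small linear solve with the Gram matrix of the constraint gradients. The following is only an illustrative sketch, assuming $H = \mathbb{R}^n$ with the Euclidean inner product so that gradients are plain vectors; the function name `tangent_projection` and the example data are hypothetical and not taken from [6].

```python
import numpy as np

def tangent_projection(v, grad_J):
    """Return P_u v = v - sum_j alpha_j * grad J_j(u).

    grad_J is a (k, n) array whose rows are the gradients of the k
    constraints at u; they are assumed to be linearly independent.
    """
    v = np.asarray(v, dtype=float)
    G = np.atleast_2d(np.asarray(grad_J, dtype=float))
    gram = G @ G.T                      # entries <grad J_i, grad J_j>
    rhs = G @ v                         # entries J_i'(u) v = <grad J_i, v>
    alpha = np.linalg.solve(gram, rhs)  # coefficients alpha_j
    return v - G.T @ alpha

# Example (made-up data): two constraint gradients in R^3.
grads = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0]])
v = np.array([3.0, -2.0, 5.0])
pv = tangent_projection(v, grads)
```

The result is orthogonal to every constraint gradient, and applying the projection twice changes nothing.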

Equation (4) is discretized in two steps:

Step 1 Choose $\Delta t_n > 0$ small, define $\bar u_{n+1} = u_n - \Delta t_n P_{u_n} \nabla I(u_n)$.

Finding the gradients of $I$ and $J_1, \dots, J_k$ at $u_n$ usually means solving a linear
partial differential equation by a convenient numerical method (e.g., finite
element method).

Step 2 - Projection Since $\bar u_{n+1}$ lies in the tangent space of $S$ at $u_n$ but not
necessarily on $S$ we need to approximate it by some $u_{n+1} \in S$. This is usually
accomplished by scaling (cf. examples of Sec. 3).

The size of $\Delta t_n$ in Step 1 is chosen to be smaller than a prescribed maximum.
After Step 2 we check whether $I(u_{n+1}) < I(u_n)$. If not, we halve the size of
$\Delta t_n$ and repeat both steps. If the value of $I$ cannot be decreased any further,
we stop.
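
The two steps and the step-halving rule can be sketched on a toy problem. This is a minimal illustration, not the discretization used in the paper: it assumes $H = \mathbb{R}^3$, the functional $I(u) = u \cdot Au$ with a made-up diagonal matrix $A$, and the single constraint $|u|^2 = 1$, for which the projection to $S$ is a plain normalization. All function names and tolerances are hypothetical.

```python
import numpy as np

def csdm(I, grad_I, tangent_proj, project_to_S, u0,
         dt_max=0.1, tol=1e-6, max_iter=10000):
    """Constrained steepest descent with step halving.

    Stops when the projected gradient is below tol or when the
    value of I cannot be decreased any further.
    """
    u = project_to_S(np.asarray(u0, dtype=float))
    for _ in range(max_iter):
        g = tangent_proj(u, grad_I(u))
        if np.linalg.norm(g) < tol:
            break
        dt, improved = dt_max, False
        while dt > 1e-14:
            u_new = project_to_S(u - dt * g)   # Step 1, then Step 2
            if I(u_new) < I(u):
                improved = True
                break
            dt *= 0.5                          # halve and repeat both steps
        if not improved:
            break
        u = u_new
    return u

# Toy problem: I(u) = u.Au on the unit sphere |u|^2 = 1.
A = np.diag([1.0, 2.0, 3.0])
I = lambda u: float(u @ A @ u)
grad_I = lambda u: 2.0 * A @ u
tangent_proj = lambda u, v: v - (u @ v) * u      # valid for |u| = 1
project_to_S = lambda u: u / np.linalg.norm(u)   # projection by scaling

u_min = csdm(I, grad_I, tangent_proj, project_to_S, [1.0, 1.0, 1.0])
```

For this toy functional the constrained local minima are the unit eigenvectors of the smallest eigenvalue of $A$, so the iteration should end near $(\pm 1, 0, 0)$ with $I \approx 1$.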

2.2 Constrained Mountain Pass Algorithm 

Assume we work in a finite dimensional approximating subspace of $H$
(conveniently chosen in 2.1). We take a discretized path in $S$ connecting $e_1$ and $e_2$,
two local minima obtained by the constrained steepest descent method. We
find the maximum of $I$ along the path. The point at which the maximum
occurs is moved a small distance in the direction of the projection of the steepest
descent of $I$ to the tangent space to $S$, and then projected back to $S$. Hence the
path has been deformed in $S$ and the maximum of $I$ lowered. The deforming
of the path is repeated until the maximum along the path cannot be lowered
any more, i.e., a critical point is reached.

Path Initialization A path in $S$ connecting $e_1$ and $e_2$ is represented by
a collection of $P$ points $z_0, \dots, z_P \in S$. There is no general rule for choosing
these points but in many cases the following obvious choice suffices:

$$\bar z_j = e_1 + \frac{j}{P}(e_2 - e_1), \qquad j \in \{0, \dots, P\},$$

where $z_j \in S$ is an approximation of $\bar z_j$ as in Step 2 of 2.1.







Main Loop First, find the maximum of $I$ on the path, i.e., find $j_m$ with
$I(z_{j_m}) = \max_j I(z_j)$. Use interpolation to improve the maximum by moving $z_{j_m}$
closer to $z_{j_m+1}$ or $z_{j_m-1}$.

Second, update $z_{j_m}$ by moving it in the direction of the steepest descent of
$I$ projected to the tangent space of $S$ at $z_{j_m}$ to decrease the value of $I(z_{j_m})$. In
fact, this amounts to the application of the two steps of the constrained steepest
descent method of 2.1 (with $z_{j_m}$ instead of $u_n$; $u_{n+1}$ is then the updated $z_{j_m}$).
Repeat these two steps until one of the following occurs:

1. the value of $I(z_{j_m})$ cannot be decreased any further,

2. in several recent consecutive repetitions of the loop the index $j_m$ has always
been the same (an infinite loop).

In the first situation $\|P_{z_{j_m}} \nabla I(z_{j_m})\| = \|I'|_{S_{z_{j_m}}}\|$ is small, i.e., $z_{j_m}$ is an
approximation of the desired critical point, and the algorithm stops. In the second
situation the path needs to be refined.

Refining the Path We prescribe a number of points to be inserted between
$z_{j_m}$ and $z_{j_m \pm 1}$ (for example two). At the same time we remove the same
number of points from both ends of the path so that the number $P$ of points
on the path stays the same.

It is sometimes useful and more efficient to design a Newton scheme for problem 
(3) and to use the solution obtained by CMPA as an initial guess for this 
scheme. The approximate constrained mountain pass solution does not need 
to be very precise. This means we can stop the CMPA early and do not need 
to refine the path that many times. 
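
The main loop can be illustrated on a toy problem. The sketch below is a simplification under stated assumptions, not the algorithm as implemented for the examples of Sec. 3: it works in $\mathbb{R}^3$ with the made-up functional $I(u) = u \cdot Au$ on the unit sphere, it omits the interpolation and path refinement steps, and it starts from a tilted half great circle rather than from the straight-line choice (which would pass through the origin here because the two minima are antipodal). The toy problem is symmetric, so the middle path point stays on the ridge; without refinement this would not hold for a general path. All names are hypothetical.

```python
import numpy as np

A = np.diag([1.0, 2.0, 3.0])

def I(u):
    return float(u @ A @ u)

def projected_gradient(u):
    g = 2.0 * A @ u
    return g - (u @ g) * u            # P_u grad I(u) on the unit sphere

def to_sphere(u):
    return u / np.linalg.norm(u)      # projection back to S by scaling

def descent_step(z, dt_max=0.1):
    """One CSDM step with halving; None if I cannot be decreased."""
    g, dt = projected_gradient(z), dt_max
    while dt > 1e-14:
        z_new = to_sphere(z - dt * g)
        if I(z_new) < I(z):
            return z_new
        dt *= 0.5
    return None

def cmpa(path, tol=1e-6, max_iter=5000):
    path = [to_sphere(np.asarray(z, dtype=float)) for z in path]
    for _ in range(max_iter):
        # index of the maximum of I along the discretized path
        jm = max(range(len(path)), key=lambda j: I(path[j]))
        if np.linalg.norm(projected_gradient(path[jm])) < tol:
            break                     # approximate critical point reached
        z_new = descent_step(path[jm])
        if z_new is None:
            break                     # I cannot be decreased any further
        path[jm] = z_new              # deform the path at its highest point
    return path[jm], I(path[jm])

# Initial path: a tilted half great circle from e1 to -e1 (both minima).
P = 20
e1 = np.array([1.0, 0.0, 0.0])
w = np.array([0.0, 1.0, 1.0]) / np.sqrt(2.0)
thetas = np.linspace(0.0, np.pi, P + 1)
path0 = [np.cos(t) * e1 + np.sin(t) * w for t in thetas]

z_mp, c_mp = cmpa(path0)
```

For this functional the constrained mountain pass level between the two minima is the second eigenvalue of $A$, so the loop should end near $(0, \pm 1, 0)$ with $I \approx 2$.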



3 Examples 

3.1 Fučík Spectrum of the Laplacian

$$-\Delta u = \mu u^+ - \nu u^- \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad (5)$$

with $\int_\Omega u^2\,dx = 1$, where $\Omega = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 > 0,\ x_2 < 2 - 4x_1,\ x_2 < 2 + 4x_1\}$
is an isosceles triangle with base 1 and height 2, $u^\pm = \max\{\pm u, 0\}$.

A point $(\mu, \nu) \in \mathbb{R}^2$ is called a Fučík eigenvalue if (5) has a weak solution
$u \in H_0^1(\Omega)$. The Fučík spectrum is then the collection of all Fučík eigenvalues. It
has been studied for various domains both analytically and numerically in [8].
The numerical investigation was based on a continuation method starting at
some known solution. Such a solution can be numerically obtained by CSDM
and CMPA.

For $t \in (0, 1)$ define a variational problem with two constraints:

$$I(u) = \int_\Omega |\nabla u|^2\,dx, \qquad J_1(u) = \int_\Omega (u^+)^2\,dx, \qquad J_2(u) = \int_\Omega (u^-)^2\,dx,$$

$$S = \{u \in H_0^1(\Omega) \mid J_1(u) = t,\ J_2(u) = 1 - t\}.$$

Critical points of $I$ on $S$ are solutions of (5) with $(\mu, \nu)$ obtained as Lagrange
multipliers. It is shown in [6] that $I$ and $J_1, J_2$ satisfy the Palais-Smale
condition at any level.

We apply the steepest descent method of Sec. 2.1. The scaling in the
projection step is performed in the following way: for $u \in H_0^1(\Omega)$ with $u^+ \neq 0$, $u^- \neq 0$,
find $t^+, t^- > 0$ such that $\bar u = t^+ u^+ - t^- u^- \in S$. Figure 2 shows two local
minimizers of $I$ on $S$ for $t = 0.2$.
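
Because the positive and negative parts are scaled separately, the two factors follow directly from the two constraints: $t^+ = \sqrt{t / J_1(u)}$ and $t^- = \sqrt{(1 - t) / J_2(u)}$. The sketch below illustrates this on nodal values; it assumes, purely for illustration, that the integrals are approximated by a weighted sum over nodes (the paper's finite element quadrature is not specified here), and the function name `scale_to_S` is hypothetical.

```python
import numpy as np

def scale_to_S(u, weights, t):
    """Rescale the positive and negative parts of u so that
    J1 = sum w (u+)^2 = t and J2 = sum w (u-)^2 = 1 - t."""
    u = np.asarray(u, dtype=float)
    u_plus = np.maximum(u, 0.0)
    u_minus = np.maximum(-u, 0.0)
    J1 = np.sum(weights * u_plus**2)
    J2 = np.sum(weights * u_minus**2)
    if J1 == 0.0 or J2 == 0.0:
        raise ValueError("u must have nonzero positive and negative parts")
    t_plus = np.sqrt(t / J1)
    t_minus = np.sqrt((1.0 - t) / J2)
    return t_plus * u_plus - t_minus * u_minus

# One-dimensional stand-in for the nodal values of a sign-changing function.
x = np.linspace(0.0, 1.0, 101)
w = np.full_like(x, x[1] - x[0])        # crude quadrature weights (assumption)
u = np.sin(2.0 * np.pi * x)
u_bar = scale_to_S(u, w, t=0.2)
```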





Fig. 2. Solutions of (5) found by CSDM. Approximate values of $(\mu, \nu)$ and $u(\Omega)$:
(a) (52.6, 47.8), [-2.50, 1.66], (b) (65.4, 34.3), [-2.19, 1.91]



These local minimizers are used as endpoints of the path in CMPA. The
algorithm converges to the solution in Fig. 3(b). During the run of CMPA the
size of $\|I'|_{S_{z_{j_m}}}\| = \|P_{z_{j_m}} \nabla I(z_{j_m})\|$ is checked and if it is small for a number of
iterations but later grows again, we may use the point $z_{j_m}$ as an initial guess
in Newton's method. If this method converges, we obtain a numerical solution
that is most likely different from the one to which CMPA eventually converges.
The solution in Fig. 3(a) was obtained this way.

A finite element method with piecewise linear functions on a triangular
grid with 11,097 nodes and 21,760 triangles was used; the path in CMPA had
$P = 50$ points.

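3.2 Fourth Order Problem in $\mathbb{R}^2$

$$\Delta^2 u + c^2 \frac{\partial^2 u}{\partial x_1^2} + g(u) = 0 \ \text{in } \mathbb{R}^2, \qquad u \to 0 \ \text{as } |x| \to \infty, \qquad (6)$$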
Fig. 3. Solutions of (5) found by CMPA. Approximate values of $(\mu, \nu)$ and $u(\Omega)$:
(a) (103.2, 45.8), [-2.55, 1.90], (b) (105.7, 43.4), [-2.41, 1.84]



where $g(u) = e^u - 1$, $c \in (0, \sqrt{2})$. It has been studied in [7]; its solutions
represent traveling waves in the direction of the $x_1$-axis with speed $c$ in a model
of a nonlinearly supported plate. The authors present a variational existence
proof based on the mountain pass theorem for a certain class of nonlinearities
$g$. Weak solutions of (6) in $H^2 = W^{2,2}(\mathbb{R}^2)$ are constructed as critical points of
a functional. Since the functional does not satisfy the Palais-Smale condition
of Theorem 1, additional work has to be done in order to recover some form of
compactness. This turns out to be difficult for the above mentioned exponential
nonlinearity and hence the proof does not cover this case.

The numerical results of [7] give, however, strong evidence of the existence
of mountain pass solutions even with the exponential nonlinearity. Moreover,
a comparison is made with a one-dimensional ODE version of problem (6),
for which a wide variety of numerical solutions was computed by a shooting
method [3]. Hence the question arose whether there is a way of finding more
numerical solutions of the two-dimensional PDE problem than just those found
in [7] using the mountain pass algorithm. We will show that the constrained
mountain pass algorithm yields additional numerical solutions.

Define functionals $I, J \in C^1(H^2, \mathbb{R})$ by




where $C > 0$ is a constant. Critical points of $I$ on $S$ are weak solutions of (6)
with $c^2$ obtained as a Lagrange multiplier. Let the inner product on $H^2$ be
$\langle \phi, \psi \rangle = \int_{\mathbb{R}^2} [(\Delta\phi)(\Delta\psi) + \phi\psi]\,dx$. It is shown in [7] that the corresponding norm
is equivalent to the standard norm on $H^2$.
Although the domain is the whole plane $\mathbb{R}^2$, for the numerical purposes
we can work on a large enough (but bounded) rectangle because our solutions
decay to zero as $|x| \to \infty$. Further, any translation of a solution of (6) is also a
solution with the same values of $I$ and $J$. These translations can be prevented
by assuming symmetries of solutions (as in [3, 7]): $u(x_1, x_2) = u(-x_1, x_2)$
and $u(x_1, x_2) = u(x_1, -x_2)$ $\forall (x_1, x_2) \in \mathbb{R}^2$. Hence we can work on a rectangle
$[-K_1, 0] \times [-K_2, 0]$. On the boundaries $x_1 = -K_1$ and $x_2 = -K_2$ the conditions
$u = 0$ and $\Delta u = 0$ are implemented, on $x_1 = 0$ and $x_2 = 0$ the symmetry
conditions.




Fig. 4. Solutions of (6) found by CSDM: (a) $c \approx 1.247$, (b) $c \approx 1.313$



CSDM with $C = 150$ yields the numerical solutions shown in Fig. 4 (the profiles
of the waves have been highlighted; only a part of the computational domain is
shown). These types of solutions were found in [7] by the unconstrained
mountain pass algorithm. The projection step (Sec. 2.1) in the numerical methods is
again accomplished by scaling: for $u \in H^2 \setminus \{0\}$ find $t > 0$ such that $\bar u = tu \in S$.

The two numerical local minima can then be used as end points of the
path in the constrained mountain pass algorithm. The algorithm converges to
the solution shown in Fig. 5(b). As noted in Sec. 3.1 already, by stopping the
algorithm early, if $\|I'|_{S_{z_{j_m}}}\|$ stays small for a number of iterations, and by
applying Newton's method, a new numerical solution may be obtained, here
the function in Fig. 5(a).

A finite difference discretization was used with $K_1 = 70$, $K_2 = 50$ and the
step size $\Delta x_1 = \Delta x_2 = 0.2$. The number of points on the path in CMPA was
$P = 50$.
Fig. 5. Solutions of (6) found by CMPA: (a) $c \approx 1.384$, (b) $c \approx 1.365$
Summary. Intake port shape significantly affects the quality of an engine. In this
paper, we present a method for improving an existing geometry with evolutionary
algorithm type optimization. Characteristic parameters of different port shapes are
calculated with a self-developed CFD program. Although only small deformations of
the original design were allowed, a significant improvement was achieved. The proposed
robust parallel evolutionary algorithm seems to be suitable for other optimization
problems on heterogeneous, unreliable clusters of workstations.



1 Introduction 

Design parameters (geometrical structure, injection parameters, etc.) of Diesel
engines affect the value of an engine in a very complex way. Engine designers
invest a large development effort in optimizing these parameters. Modeling a
complete engine is a very difficult task both experimentally and numerically,
therefore a complex optimization cannot be performed for the whole system.
Hence the traditional way is to split the engine into several main parts,
find some parameters that characterize the value of each part and perform
optimization processes on these parts separately. However, after these
optimizations we have to put the parts together, check whether the
independently optimized parts can be assembled, and check the overall
behaviour of the system.

One of these main parts is the intake port system. Its geometry strongly
affects the main parameters (e.g. power, efficiency, pollution emission) of the
engine. (See Fig. 1 for illustrations of intake port shape.) The essential reason
for this strong dependence is that the geometry of the intake port determines the
amount and initial velocity distribution of the air drawn into the cylinder. The
amount of fresh air influences the maximum mass of fuel burnt in one cycle, while
the initial velocity distribution affects the air-fuel mixture formation process,
which has a high impact on the quality of burning.



This paper was supported under the Hungarian Grant for Scientific Research 
OTKA T43177. 




460 A. Horvath, Z. Horvath 




Fig. 1. An opaque view of surface and a cross section of intake port shape



It is obvious that the more air is drawn in, the better. The dependence on
velocity distribution is more complex, but engineering practice gives us a
plausible guideline: the most important property of the velocity distribution in the
cylinder is the angular momentum per unit mass, which characterizes the global
rotation. Both too small and too large global rotation are detrimental to efficient
air-fuel mixture formation. There are commonly accepted optimal rotation
ranges for different types of engines, coming from experiments.

The traditional way to describe these two important attributes of an intake
port is to calculate two non-dimensional parameters: the flux coefficient and
the swirl coefficient, denoted by $C_f$ and $C_s$ respectively; for a more detailed
definition see Section 2.2.

Our task was to improve an existing intake port geometry, namely to enhance
$C_f$ while keeping $C_s$ near the original value. To avoid a complete
redesign of the existing and working engine head structure, only small
deformations of the original geometry were allowed. Although this constraint means
that only a small improvement is to be expected, even a 1% improvement in $C_f$
can be important from an engineering point of view.



2 The air flow in the intake port 

2.1 CFD calculations 

At the core of the optimization process we needed reliable and accurate CFD
software which can calculate the two characteristic flow parameters for port
shapes that are derived from the original shape by small deformations. For this
purpose we used self-developed software based on a classical FVM with some
improvements. A detailed description can be found in [3]. Here
we give only a short summary of its properties:
— FVM with flux vector splitting (Vijayasundaram-type, [7])

— solves the compressible Euler or Navier-Stokes equations

— conservative in mass, momentum and energy

— explicit (first order) time steps

— local time steps for faster calculations

— uses an unstructured, conforming tetrahedral mesh

— capable of handling small deformations of the mesh

A tetrahedral mesh with 147,775 elements was generated with CAD
software and was used as the discretization of the original shape. (See Fig. 2.)




Fig. 2. Cut of tetrahedral mesh and stationary velocity field near the valves 



2.2 Characteristic parameters 



Intake ports are characterized in the following standard way:

Let the pressure be constant at the inlet and outlet ($p_{in}$ and $p_{out}$). Let us measure
the mass flux ($\dot m$) and the flux of angular momentum ($T$) at the outlet (in the
cylinder) in the stationary case.

The characteristic parameters of intake ports are the flux coefficient ($C_f$)
and the swirl coefficient ($C_s$) (see [4]):

$$C_f = \frac{\dot m / \rho_0}{A v_0}, \qquad C_s = \frac{8T}{\dot m B v_0}, \qquad (1)$$

where $\rho_0$ is the density of air at the inlet, $A$ is the approximate area of the
smallest intake cross section (two times the valve inner seat area),
$v_0 = \sqrt{2(p_{in} - p_{out})/\rho_0}$ is the characteristic velocity based on the pressure drop and
$B$ is the cylinder bore.
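
As a worked illustration, the two coefficients can be evaluated directly from their definitions, $C_f = (\dot m/\rho_0)/(A v_0)$ and $C_s = 8T/(\dot m B v_0)$ with $v_0 = \sqrt{2(p_{in} - p_{out})/\rho_0}$. The function name `port_coefficients` and all numerical inputs below are made up for the example (SI units assumed); they are not data from the paper.

```python
import math

def port_coefficients(m_dot, T, p_in, p_out, rho0, A, B):
    """Flux and swirl coefficients from mass flux, angular momentum flux,
    inlet/outlet pressures, inlet density, reference area and bore."""
    v0 = math.sqrt(2.0 * (p_in - p_out) / rho0)   # characteristic velocity
    c_f = (m_dot / rho0) / (A * v0)
    c_s = 8.0 * T / (m_dot * B * v0)
    return c_f, c_s

# Made-up inputs chosen so that v0 = 10 m/s.
c_f, c_s = port_coefficients(m_dot=0.01, T=0.00125,
                             p_in=100050.0, p_out=100000.0,
                             rho0=1.0, A=0.002, B=0.1)
```

Both results are dimensionless, which is a quick sanity check on the units of the inputs.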
The larger $C_f$ the better, while the values of $C_s$ have an “optimum interval”
coming from engineering experience. (See the introduction.)

$\dot m$ and $T$ can be calculated from the numerical model by approximating the
following integrals on the outlet face ($S_{out}$):

$$\dot m = \int_{S_{out}} \rho v\,dS, \qquad T = \int_{S_{out}} \rho\,(v \times (r - r_0))\,dS,$$

where $r_0$ is an arbitrary point on the symmetry axis of the cylinder, and $\rho$ and $v$
are the calculated density and velocity of air.

We found that calculations with this model can reproduce the experimentally
determined characteristic parameters within an acceptable relative difference
even in the case of non-viscous calculations. For 4 different values of $p_{out}$ we
could reproduce the measured values of $C_f$ within 1% and those of $C_s$ within
10% relative difference. (See [3] for details.) Six small deformations of the port
shape were also realized and measured experimentally: a similar correspondence
was observed between computed and measured parameters.

Therefore we used our self-developed CFD code for the evaluation of deformed
port shapes.



3 The optimization strategy 

The main steps of finding a shape optimization strategy for our problem were
the following:

1. Find and parametrize a set of small deformations. 

2. Find a suitable object function which measures the quality of a deformed 
shape. 

3. Find and implement a method to maximize the object function.

In the following subsections we will go through these steps. 

3.1 Parametrization of small deformations 

We used local, smooth deformations on the original grid. This way we did
not have to generate a tetrahedral mesh for each deformed shape. Instead of
this very time consuming step we modified only the coordinates of the vertices
and recalculated the geometrical parameters (volume, face normals, etc.) of
the tetrahedra.

The overall deformation of the shape was put together from elementary
deformations. Each elementary deformation shifted the vertices only inside a
specified cylinder whose axis is parallel to the surface normal. (See Fig. 3.) The
shift of the vertices is a fourth order polynomial of the distance from the axis of
the cylinder and the depth relative to the deformation center.
Fig. 3. A smooth local deformation on a simple tetrahedral mesh



This choice also fits the engineering requirements: we could ensure that
the deformations stay small and that some places (e.g. the valves, the top of
the cylinder) remain unchanged.

We used combinations of such elementary deformations. This way the
parametrization of deformations consists of a list of parameters of elementary
deformations, i.e. a list of deformation centers, radii and sizes. (The height of
the deformation cylinder is not a free parameter: it does not affect the shape,
only the tetrahedral mesh. We have to choose its value carefully depending on
the deformation parameters.)

3.2 The object function 

Our goal is to maximize $C_f$ while keeping $C_s$ close to the original value.
Therefore we calculated the parameters with no deformations and used them as
reference values $C_f^{ref}$ and $C_s^{ref}$. Our object function was:

$$Obj = \frac{C_f}{C_f^{ref}} - D \left( \frac{C_s}{C_s^{ref}} - 1 \right)^2. \qquad (2)$$

Namely we have a one-variable object function with a penalty term
depending on the change in $C_s$. We found $D = 1.0$ to be adequate in our case. Thus
a 10% change in $C_s$ matches a 1% change in $C_f$. (Remember that $C_s$ has no strict
optimal value but rather an optimal interval, that is, a change of a few percent is not
significant.) For example, $Obj = 1.02$ means at least a 2% improvement in mass
flux (depending on the change in the swirl number).
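
The trade-off between the two coefficients can be checked with a few lines of arithmetic. The sketch below assumes the object function has the quadratic-penalty form $Obj = C_f/C_f^{ref} - D\,(C_s/C_s^{ref} - 1)^2$, which is consistent with the statement that a 10% change in $C_s$ matches a 1% change in $C_f$ for $D = 1.0$; the function name `object_function` and the unit reference values are hypothetical.

```python
def object_function(c_f, c_s, c_f_ref, c_s_ref, D=1.0):
    """Relative flux gain minus a quadratic penalty on the swirl change
    (assumed form of the object function)."""
    return c_f / c_f_ref - D * (c_s / c_s_ref - 1.0) ** 2

# Undeformed shape: both ratios equal one, so the value is exactly 1.
obj_ref = object_function(1.0, 1.0, 1.0, 1.0)

# A 10% swirl change costs as much as a 1% flux loss.
penalty_10_percent = 1.0 - object_function(1.0, 1.1, 1.0, 1.0)

# 2.5% more flux with 3% less swirl: 1.025 - 0.03**2 = 1.0241.
obj_example = object_function(1.025, 0.97, 1.0, 1.0)
```

The last value rounds to 1.024, so under this assumed form a 2.5% gain in $C_f$ with a 3% drop in $C_s$ loses less than 0.1% to the penalty.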

During the CFD calculations we observed significant “noise” in $C_f$ and $C_s$.
For example we ported our C code to two different hardware architectures and 
found a difference in the order of 10“^ in C/ and 10“^ in Cg. Similar differences 
appeared on the same architecture in the case of very small deformations. 
This noise derives from rounding errors and the nature of the calculations: the
convergence to the stationary flow is slow in $C_s$.

This means that the object function has a lot of false local maxima as the deformation
parameters vary. For this reason we decided to use a genetic type algorithm
for optimization. This is a common choice in shape optimization problems. (See
e.g. [6].)



3.3 Optimization method 

Genetic (GA) or evolutionary (EA) type algorithms have a lot of variants. (See
[2], [8].) Depending on the problem and the available hardware, different
versions of GA/EA should be used.

The main peculiarities of our problem and resources are the following:

— The workstations we can use have different CPU speeds.

— The calculation time of the object function (a complete CFD simulation) depends
on the geometry and takes 3-4 hours on our fastest workstations.

— We can use approximately 20-30 workstations for a few weeks, therefore
2000-4000 object function evaluations are possible.

— There is a significant probability of hardware errors during the calculations.

It is obvious that a classical GA/EA with master-worker type parallelization
is not suitable in our circumstances. One reason is the significant probability
of hardware errors, the other is the varying evaluation time. Both can lead
to a significant loss of efficiency due to the synchronization stages.

Furthermore, an island type parallelization with one island per workstation
is not a good choice either, because each workstation can calculate only 2-4
full generations with a reasonable population size. Since new chromosomes from
other islands can be included only at the end of a complete generation cycle,
the frequency of communication between different subpopulations is too small for
efficient parallelization.

A Robust Parallel Evolutionary Algorithm The special features of our problem
and hardware led us to a special type of parallelization strategy.
The main idea is to use a separate population on each workstation with an
evolutionary algorithm which tries to send and receive messages about evaluated
chromosomes after each object function evaluation; if it has received a new evaluated
chromosome from another machine, it incorporates it into the population
immediately. The communication is executed in two stages: the EA-process writes the
evaluated chromosomes and object function values to the local hard disk and
checks it for evaluated chromosomes that have arrived from other machines. “Good”
chromosomes are moved between EA-units by “agent” programs which
run independently of the EA-processes.

This way we achieved stability against hardware failures and a strong
connection between the EA-units with zero synchronization loss. If an EA-unit
fails, the others can keep working; if the connection between
units is lost (network error), the units can continue evaluating object
functions; and if the workstation executing the agent program fails, the agent can be
restarted on an arbitrary workstation. Even after a global failure the EA-units
can read back the evaluated chromosomes and continue working. Although
each failure decreases the efficiency, the whole calculation will not stop, and
after repair the calculation can continue at full capacity.

Certainly we need a special EA algorithm to guarantee that the new results
coming from other EA-units are used as soon as possible. In a classical GA/EA
cycle this is not possible. Therefore we searched for a more flexible approach
and found the concept of “Flexible Evolution Agents” (FEA-s, see [9]).

FEA-s do not have a strictly prescribed sequence of genetic operators;
instead, a central “decision engine” decides which genetic operator (mutation,
crossover, etc.) will be executed in the next step. This kind of flexibility
is used to make the EA adaptive: a learning engine collects statistics
about the success of operators and the decision engine uses this information to choose
the operators to be executed next. (See [9].)

We did not implement the adaptivity of FEA because each EA-unit
executes only 100-200 genetic operators in our case, which is too few for
reliable statistics. However, the non-deterministic order of genetic operators
allowed us to read and use the results of object function evaluations of other
units provided by the communication agent program.

The skeleton of a robust and parallel EA-unit:

1. Population = empty

2. sort the population with niching

3. if ( there are new chromosomes on disk )
      read new values

4. if ( Population.size < Size_min )
      generate and evaluate a random chromosome
      -> step 2

5. if ( Population.size > Size_max )
      truncate Population

6. if ( the best of Population changed )
      try one line search step between
      old and new best value
      -> step 2

7. find a new chromosome by a random elementary step

8. evaluate the new chromosome

9. add/replace new chromosome to Population

10. -> step 2

The possible elementary steps we used: 

— mutate a randomly chosen chromosome 

— single point crossover between two different chromosomes selected with bi- 
nary tournament 

— examine a random chromosome: if it is closer to a better chromosome than
a specified distance, then mutate it (constrained mutation)

The set of elementary steps was pieced together using the experience of
test calculations on classical test problems. Hill climbing was not used because
it would prevent the immediate reading and use of the results of other units. Instead,
we implemented step 6 and “constrained mutation” (see above), which
search near the existing good chromosomes and have a similar effect.

The probabilities of the elementary steps are chosen so that in a long calculation
their numbers will be equal to the numbers of such steps in a classical
GA.
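
The skeleton and the elementary steps can be sketched as a small steady-state loop. This is a deliberately reduced illustration in Python (the authors' code was written in C): niching, the line search of step 6, constrained mutation and the writing of results to disk are omitted, and the JSON file layout of the inbox directory, the mutation width and all function names are assumptions made for the example.

```python
import glob
import json
import os
import random

def read_inbox(inbox_dir, seen):
    """Step 3: read evaluated chromosomes that arrived on the local disk."""
    arrived = []
    for path in sorted(glob.glob(os.path.join(inbox_dir, "*.json"))):
        if path in seen:
            continue
        seen.add(path)
        with open(path) as f:
            record = json.load(f)
        arrived.append((float(record["score"]), list(record["genes"])))
    return arrived

def tournament(candidates, rng):
    """Binary tournament: the better of two randomly drawn chromosomes."""
    a, b = rng.sample(candidates, 2)
    return a if a[0] >= b[0] else b

def elementary_step(pop, rng, lo, hi, sigma=0.3):
    """Random elementary step: mutation or single point crossover."""
    if rng.random() < 0.5:
        genes = list(rng.choice(pop)[1])
        i = rng.randrange(len(genes))
        genes[i] = min(hi, max(lo, genes[i] + rng.gauss(0.0, sigma)))
        return genes
    p1 = tournament(pop, rng)
    p2 = tournament([c for c in pop if c is not p1], rng)  # a different parent
    cut = rng.randrange(1, len(p1[1]))
    return p1[1][:cut] + p2[1][cut:]

def ea_unit(objective, n_genes, lo, hi, inbox_dir, n_evals,
            size_min=10, size_max=30, seed=0):
    """Maximize objective; returns the population sorted best first."""
    rng = random.Random(seed)
    pop, seen = [], set()                                   # step 1
    for _ in range(n_evals):
        pop.extend(read_inbox(inbox_dir, seen))             # step 3
        pop.sort(key=lambda c: c[0], reverse=True)          # step 2, no niching
        if len(pop) < size_min:                             # step 4
            genes = [rng.uniform(lo, hi) for _ in range(n_genes)]
        else:
            del pop[size_max:]                              # step 5
            genes = elementary_step(pop, rng, lo, hi)       # step 7
        pop.append((objective(genes), genes))               # steps 8 and 9
    pop.sort(key=lambda c: c[0], reverse=True)
    return pop
```

Because the inbox is checked before every evaluation, a chromosome dropped into the directory by another process is used at the very next step, which is the property the non-deterministic operator order is meant to provide.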

Test results We performed several test calculations with different agent
strategies on classical test problems (Rastrigin and Keane functions, see [5]).
First we present results with a very simple agent strategy called “uniform
distribution”, which means that the agent program periodically collects the best two
chromosomes from each unit and sends them to all the other units. We
found that the period of collecting and distributing the best results should
be longer than 5-10 object function evaluation times but shorter than one
tenth of the total calculation time. Within this wide range we found that
the convergence of the best object function value does not depend on the
redistribution frequency.
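
One round of the “uniform distribution” strategy amounts to a collect-and-broadcast step. The sketch below shows only that logic on in-memory data; the actual agents were shell scripts moving files between machines, and the function name `uniform_distribution` and the data layout are hypothetical.

```python
def uniform_distribution(unit_results, n_best=2):
    """One agent round: unit_results maps a unit id to its list of
    (score, genes) pairs. Returns, for every unit, the chromosomes it
    should receive: the n_best chromosomes of each other unit."""
    best = {
        uid: sorted(results, key=lambda c: c[0], reverse=True)[:n_best]
        for uid, results in unit_results.items()
    }
    return {
        uid: [c for other, top in best.items() if other != uid for c in top]
        for uid in unit_results
    }

# Three units with made-up scores and one-gene chromosomes.
results = {
    "a": [(0.1, [1.0]), (0.9, [2.0]), (0.5, [3.0])],
    "b": [(0.7, [4.0])],
    "c": [(0.2, [5.0]), (0.3, [6.0])],
}
deliveries = uniform_distribution(results)
```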

In Figure 4 we present the results for the 20-variable Keane function problem
with 1, 4 and 16 EA-units with the uniform distribution strategy. One can observe
decreased efficiency in the 16 EA-unit case. Such a decay is a well known property
of parallel algorithms.

We tested the robustness of our parallel EA strategy on test problems. In
a test calculation with 16 EA-units we randomly stopped the agent program
for significant intervals to simulate network errors. During
the “network errors” the growth of the best object function values slowed down, but
when the communication was restored, the results increased rapidly.
This shows that without communication the EA-units kept working,
producing a high variety of chromosomes, and when the communication was
repaired, the units could use them to produce good entities. (See Fig. 4 on
the right.)

To circumvent the efficiency loss mentioned above, we divided the EA-units into
groups of 5-10 elements and used uniform distribution within them. Another
agent program was applied to realize the communication between groups. This
multilevel strategy has a close relationship with the island model; in our case
a group of EA-units corresponds to an island.

Fig. 4. Test calculations with 20-variable Keane function.

In the real CFD shape optimization problem we could use at most 32
workstations. Following the multilevel agent strategy, we divided them into
3 groups (A, B and C) with approximately equal numbers of members and implemented
a non-symmetric data flow in the top level agent program: the best results
of groups B and C were sent to group A periodically. This way groups B and
C evolved separately from the other groups, which is a good strategy to maintain
diversity, while group A could combine all the best chromosomes.



4 Results of intake port optimization 

4.1 About technical aspects 

Both the CFD and the EA code were written in standard C. The agents were Bourne
shell scripts. The calculations were performed on Linux workstations at
Széchenyi István University. The maximum number of workstations was 32.
The CPU speeds were between 1.5 and 2.4 GHz.

A typical evaluation took 3-4 hours (on 2.4 GHz Pentium 4 machines).
There were 6 significant hardware problems during the calculations (power
outages, hard disk problems, network problems, etc.). This shows that the
robustness of the EA method was of critical importance in our case.



4.2 Preparation 

Based on the previously mentioned calculations, the most sensitive parts of the intake
system geometry were chosen. (E.g. large pressure gradients on the surface indicate
large resistance, high velocity values near the boundaries indicate important
parts, etc.)

Small deformations of the sensitive parts were selected with a radius of 5-90 mm
and a maximum size between -15 mm and +15 mm.

The goal of the optimization process was to find the optimal deformation size
values at fixed locations with fixed radii. Thus we used the floating point
values of the deformation sizes as genes.

4.3 Calculations 

In the first calculation 20 deformation points were used. On each EA-unit
Size.min=40, Size.max=120 were set. The optimal value we found in 2000
function evaluations was 1.016. Although this is significant, we wanted to get
a higher improvement.

The study of the optimal chromosomes led to important conclusions:

- There were 7 genes where the optimal value was at the lower or upper limit of
that deformation size (extremal points)

- There were 5 genes where the absolute value of the optimal deformation size
was less than 1 mm (irrelevant points)
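
The two criteria above can be sketched in a few lines of Python. The function name `classify_genes`, the argument layout and the default 1 mm threshold parameter `tol` are illustrative assumptions; the source only states the criteria, not how they were evaluated.

```python
def classify_genes(chromosome, limits, tol=1.0):
    """Return (extremal, irrelevant) gene indices of an optimal chromosome.

    chromosome: deformation sizes in mm, one per deformation point.
    limits: (lower, upper) bound of the deformation size for each gene.
    A gene is extremal if it sits on one of its limits and irrelevant
    if the absolute deformation size is below tol (in mm).
    """
    extremal, irrelevant = [], []
    for index, (size, (lower, upper)) in enumerate(zip(chromosome, limits)):
        if size <= lower or size >= upper:
            extremal.append(index)
        elif abs(size) < tol:
            irrelevant.append(index)
    return extremal, irrelevant
```

Extremal genes suggest widening the corresponding limits, while irrelevant genes are candidates for removal from the parameter set.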

Using this result, a new set of deformations with new limits of deformation
sizes was chosen and a completely new optimization process was started: the 5
irrelevant points were dropped and 3 new (hopefully not irrelevant) ones were added.
The limits for the 7 extremal points were modified. We present the results of
the optimization of these 18 parameters. Figure 5 shows the object function
values during the calculations. One can observe very small improvements in the
best values and decreasing diversity at the end of the calculations. These symptoms
indicate that there is no reason to continue the calculations further.





Fig. 5. Object function values during the optimization process (horizontal axis: time in days)



The best chromosome we found has object function value 1.024. It means
a 2.5% improvement in C/ and 3% decrease of Cg. In practice it may result in
e.g. approximately 2.5% extra power with similar quality of burning.

We present the most important differences between the original and the optimal
shapes in Figure 6. The changes are plausible, but the deformation sizes could not
have been determined by hand.





Fig. 6. Legend: gray: original shape, mid gray: unchanged parts



We examined the flow in the optimal shape and found significant differences.
For example, the pressure gradient at the surface decreased notably.



5 Conclusions 

The robust parallel EA-method proved useful under unreliable hardware
conditions. In further work we will focus on optimizing the EA-units and
agent strategies.

With small deformations (less than 12 mm in size) a significant improvement
was achieved. With a larger set of deformations some further improvement can
be expected, but a much higher improvement probably requires a major redesign.

We can conclude that our system is suitable for real 3D compressible fluid
flow shape optimization tasks.
Summary. In this paper we present a numerical algorithm for the solution of the
equations of compressible nonviscous fluids on domains with moving (translating)
boundaries. Our moving mesh algorithm, defined on special tetrahedral meshes,
avoids global interpolation and re-meshing, thus it works quite efficiently on problems
with strongly deforming domains. As an illustration we give some computational results
obtained when our algorithm is applied to the simulation of gas flow in a high-voltage
circuit breaker.



1 The engineering problems and the scope of the paper 

The simulation of many industrial processes requires the numerical solution of
compressible fluid flow on moving domains. Examples are airfoil
oscillation, mixture formation in the combustion chamber and the cylinder of
a Diesel engine, and flow development in a circuit breaker. In several problems
the flow domain is strongly compressed and/or stretched in a certain interval
of the simulation time. So, to avoid small time step-sizes due to distorted
cells, re-meshing of the flow domain and, accordingly, interpolation of the state
variables are necessary from time to time, at least in the case of explicit methods.
However, global re-meshing and interpolation are time-consuming; moreover,
the latter introduces additional numerical and non-conservativity errors.
Further, even if the state variables are interpolated in a conservative way, the
conservation errors of some other important conserved quantities, such as the
total angular momentum, usually increase, see e.g. [2].

In this paper we present the method that we applied successfully
to real-life problems, such as the simulation of the flow in a Diesel engine
and in a high-voltage circuit breaker (this can be considered a domain with
several pistons and valves). In both problems we have to compute compressible
(multicomponent) gas flow in a strongly deforming 3D domain where the
deformation is induced by translating boundary parts. In Section 2 we pose
our mathematical model, the multicomponent compressible Euler equations
on moving domains. Then in Section 3 we introduce a first order finite volume
method based on Vijayasundaram’s numerical flux function with explicit time
stepping. The core of the numerical algorithm is the moving tetrahedral mesh
algorithm called snapper, the basic idea of which goes back to the snapper
algorithm for hexahedral meshes given in [2]. This results in a method which
requires re-meshing and interpolation only in the very close neighbourhood of
the moving objects and, moreover, these are done very efficiently.

This numerical method is coded and we show applications to a 3D (aca- 
demic) test problem and the industrial problem of gas flow in a high-voltage 
circuit breaker in Section 4. We conclude that comparisons of the computa- 
tions with the exact solution of the test problem and with actual physical 
measurements for the latter problem show good agreement. 



2 The mathematical model 

We have selected the multicomponent Euler equations of gas dynamics on moving
domains as our mathematical model of the fluid flow problems described
in Section 1. This model is suitable for the modelling of flows where the effect
of viscosity is not significant compared to that of convection.

$$\frac{\partial u}{\partial t} + \operatorname{div} f(u) = 0 \quad \forall t \in [0, t_{\max}],\ x \in \Omega(t) \tag{1}$$

$$u(0, x) = u_0(x) \quad \forall x \in \Omega(0) \tag{2}$$

$$\text{+ BC (boundary conditions)} \tag{3}$$

$$\text{+ EOS (equations of states)} \tag{4}$$



where

- $u : \bigcup_{t \in [0, t_{\max}]} \{t\} \times \Omega(t) \to \mathbb{R}^{K+4}$ is the state variable with
  $u = (\rho_1, \dots, \rho_K, \rho v^T, e)^T$, where $\rho_k = \rho_k(t, x)$ $(k = 1, \dots, K)$,
  $\rho := \sum_{m=1}^{K} \rho_m$, $v = v(t, x) = (v_1, v_2, v_3)^T$ and $e = e(t, x)$
  are respectively the density of the fluid components,
  the density of the fluid, the fluid velocity and the total energy density (i.e.
  the total energy per unit volume of the fluid);

- $f = (f_1, f_2, f_3)^T$ is the flux vector with
  $f_i(u) = (\rho_1 v_i, \dots, \rho_K v_i, (\rho v_i v + p e_i)^T, v_i (e + p))^T$ $(i = 1, 2, 3)$,
  where $e_i$ is the $i$th coordinate unit vector; $\operatorname{div} f(u) = \sum_{i=1}^{3} \partial f_i(u) / \partial x_i$;

- EOS denotes the set of the equations of states of a non-ideal gas mixture:

  $$p = \sum_{m=1}^{K} \frac{\rho_m R T}{W_m}, \qquad e = \sum_{m=1}^{K} \rho_m I_m(T) \tag{5}$$

  where $T = T(t, x)$ denotes the temperature of the fluid; $I_m(T)$ is the specific
  internal energy given a priori by interpolation formulas based on tabulated
  values, $R$ is the universal gas constant and $W_m$ is the molecular weight of
  the $m$th fluid component;

- BC: linearly consistent boundary conditions formulated with the help of ghost
  cells (for more details see e.g. [3] pp. 457-460, [11] pp. 222-224 with an
  emphasis on moving boundaries; see also [2]);

- $\Omega(t)$ is the time dependent flow domain defined in the following way: $\Omega(0)$
  is a given initial domain, which is deforming according to the mapping
  $\varphi : [0, t_{\max}] \times \Omega(0) \to \mathbb{R}^3$, i.e. $\Omega(t) = \{ x = \varphi(t, x_0) \mid x_0 \in \Omega(0) \}$;
  suppose that $\varphi_t := \varphi(t, \cdot)$ is one-to-one for all $t$; then the velocity of the
  points of $\Omega$ is given by $\kappa(t, x) := \frac{\partial \varphi}{\partial t}(t, \varphi_t^{-1}(x))$ $\forall t \in [0, t_{\max}]$ and $\forall x \in \Omega(t)$.

Note that $\varphi$ is not unique if we prescribe only the deformation of the boundary
of $\Omega$, which is the case in the situations we are focussing on in this paper.

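The following lemma of calculus, called Reynolds’ transport theorem, is
the basic tool to obtain an integral formulation for the fluid flow on a moving
domain.

Lemma 1. Let $V$ be a moving subdomain of $\Omega$, i.e. $V(0) \subset \Omega(0)$ and $V(t) := \varphi_t(V(0))$.
Then for any differentiable function $\psi : \mathbb{R}^4 \to \mathbb{R}$ we have

$$\frac{d}{dt} \int_{V(t)} \psi(t, x)\,dx = \int_{V(t)} \frac{\partial \psi}{\partial t}(t, x)\,dx + \int_{\partial V(t)} \psi(t, s)\,\kappa(t, s) \cdot n(t, s)\,ds.$$

Integrating (1) over a moving subdomain $V$ we get, by applying Lemma 1,
a weak formulation of the Euler equations on moving domains as

$$\frac{d}{dt} \int_{V(t)} u\,dx + \int_{\partial V(t)} \left( f(u) \cdot n - \kappa \cdot n\,u \right) ds = 0 \quad \forall V \subset \Omega \text{ moving subdomain of } \Omega \tag{6}$$

or, integrating in time and using the notation

$$\bar u = \bar u_V(t) = \frac{1}{|V(t)|} \int_{V(t)} u(t, x)\,dx,$$

$$|V(t_b)|\,\bar u(t_b) - |V(t_a)|\,\bar u(t_a) + \int_{t_a}^{t_b} \int_{\partial V(t)} \left( f(u) \cdot n - \kappa \cdot n\,u \right) ds\,dt = 0 \quad \forall [t_a, t_b] \subset [0, t_{\max}],\ \forall V \subset \Omega. \tag{7}$$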

3 The numerical methods 

For the construction of a numerical method for the problem posed in Section 2
we consider a moving tetrahedral mesh of $\Omega(t)$. This means that we have
a face-to-face partition of $\Omega$ into tetrahedra $T_j$, $j = 1, \dots, N$, such that each
tetrahedron is moving according to $\varphi$, the transformation function of $\Omega$. We
allow that at certain time-points, called snapping (or re-meshing) points of time,
the structure of the partition and even $N$ can be changed.

Further, we denote by $S_{jl}$ the $l$th side of $T_j$ ($l = 1, \dots, 4$) according to
a certain agreement, which does not change between subsequent snapping
points. Then the tetrahedron neighbouring $T_j$ and sharing the face $S_{jl}$ will
be denoted by the local indexing $T_{jl}$ as well. ($T_{jl}$ denotes a “ghost” tetrahedron
if $S_{jl} \subset \partial\Omega$.) Further, let $n_{jl}$ denote the outer unit normal vector of $T_j$
corresponding to the face $S_{jl}$.

For the time discretization we suppose that an adaptively defined subdivision
$0 = t^0 < \dots < t^n < \dots$ of $[0, t_{\max}]$, ending at $t_{\max}$, is given,
where each snapping point belongs to $\{t^n\}$; further, $\tau_n := t^n - t^{n-1}$ is the
$n$th time step-size.

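Now we are in a position to introduce our numerical method. Suppose
that $u$ is a weak solution of the problem posed in Section 2. Applying (7)
with $V = T_j$ on $[t^{n-1}, t^n]$ we obtain the explicit scheme for the values
$u_j^n \approx \frac{1}{|T_j(t^n)|} \int_{T_j(t^n)} u(t^n, x)\,dx$:

$$|T_j(t^n)|\,u_j^n = |T_j(t^{n-1})|\,u_j^{n-1} - \tau_n \sum_{l} \bar s_{jl}\, g(u_j^{n-1}, u_{jl}^{n-1}, \bar n_{jl}, \bar\kappa_{jl}), \quad n = 1, 2, \dots \tag{8}$$

where $g$ is the numerical flux function on sides and $\bar n_{jl}$, $\bar\kappa_{jl}$, $\bar s_{jl}$ approximate
in some sense $n_{jl}$, $\kappa_{jl}$ and $|S_{jl}|$, respectively, such that

$$\int_{t^{n-1}}^{t^n} \int_{S_{jl}(t)} \left( f(u) \cdot n - \kappa \cdot n\,u \right) ds\,dt \approx \tau_n\, \bar s_{jl}\, g(u_j^{n-1}, u_{jl}^{n-1}, \bar n_{jl}, \bar\kappa_{jl}). \tag{9}$$

The initial values for (8) are defined by $u_j^0 := \frac{1}{|T_j(0)|} \int_{T_j(0)} u_0(x)\,dx$. For
a time-stepping scheme we have to define the moving mesh algorithm, the
numerical flux function $g$ and the geometrical parameters. We devote the following
three subsections to the definition of these.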

3.1 The moving mesh algorithm: the snapper 

Here we give an algorithm for an efficient discretization of the moving domain
$\Omega = \Omega(t)$ which is strongly deforming due to some translating parts
of the boundary. At first we divide $\Omega$ into non-overlapping blocks (moving
subdomains) according to the type of the motion/deformation and discretize
the blocks separately, taking care of the face-to-face property on the common
parts of the boundary of the blocks; the union of the tetrahedra of the blocks
finally constitutes the tetrahedral mesh $\{T_j\}$. The type of a block $B$ can be
fixed (if $\kappa \equiv 0$), shifting (if $\kappa(t, x) = \beta(t) b$) or deforming. We call $B$ of the
latter type if $B$ is translation invariant in the sense that there exists a surface
$S_0 \subset \mathbb{R}^3$ such that $B_0 := \bigcup_t B(t)$, the frame of $B$, equals the volume swept by
shifts of $S_0$, i.e. $B_0 = \bigcup_{\lambda \in [0,1]} (S_0 + \lambda b)$, and, moreover, there exists a moving
part of the boundary $M \subset \partial B$ such that $\kappa(t, x) = \beta(t) b$ whenever $x \in M(t)$
and $\kappa \cdot n = 0$ otherwise.

The discretization of the blocks of the first or the second type can be an
arbitrary tetrahedral mesh fitting the geometry at time $t = 0$, and this is left
unchanged in time (first type) or simply shifted by $\beta(t) b$ (second type). Let
us now assume that $B$ is of deforming type and, for ease of presentation,
suppose that $S_0 \subset \partial B(0)$ and $M(0) \subset S_0$ (e.g. the block of a valve in its
bottom dead center). The task we have to solve is the discretization of $B$
at the time points $t^n$ given by the flow calculation (the “hydrocode”) such that
$B(t^n)$ and its mesh are derived by the mapping $\varphi$ from the given $B(t^{n-1})$ and
its mesh ($n = 1, 2, \dots$); this step is called mesh modification. But first we need
the discretization of the frame $B_0 = B(0)$.

Layered tetrahedral mesh generation. The steps of the discretization of
$B_0$ are the following. (For an illustration see Figure 1.)

1. Triangulate $M(0) \subset S_0$ and extend this triangulation to that of $S_0$.

2. Translate this triangulation by $\frac{1}{L} b$, where $L$ is a positive
integer, to obtain a layer of prisms.

3. Divide the prisms into 3 tetrahedra each to get a face-to-face tetrahedral
mesh of this layer.

4. Translate the last layer with its meshing by $\frac{1}{L} b$ while necessary.

Note that Step 3 is not a trivial task; for a solution see [6].
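
Steps 2 and 3 can be illustrated for a single prism with the following Python sketch. It is not the algorithm of [6]: it uses one fixed choice of diagonals on the three quadrilateral faces and does not address the harder question of choosing the diagonals consistently between neighbouring prisms, which is what the face-to-face property requires. The function names and the vertex ordering (bottom triangle first, then the translated top triangle) are assumptions of the example.

```python
def tet_volume(a, b, c, d):
    """Unsigned volume of the tetrahedron with vertices a, b, c, d."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0


def extrude_triangle(triangle, shift):
    """Translate a triangle by `shift`; return the 6 prism vertices."""
    bottom = [tuple(p) for p in triangle]
    top = [tuple(p[i] + shift[i] for i in range(3)) for p in triangle]
    return bottom + top


def split_prism(prism):
    """Split a prism (vertices 0-2 bottom, 3-5 top) into 3 tetrahedra."""
    p = prism
    return [(p[0], p[1], p[2], p[3]),
            (p[1], p[2], p[3], p[4]),
            (p[2], p[3], p[4], p[5])]
```

Each of the three tetrahedra has one third of the prism volume, which is a quick sanity check for any implementation of Step 3.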

We remark that this algorithm was also successfully applied to the discretization
of non-deforming and non-regular blocks, in such a way that first
we enclosed the block to be meshed in a layered mesh (in these steps we
allowed “biased” translations), omitted the tetrahedra not intersecting the block
and dragged the boundary nodes of the union of the remaining tetrahedra to the boundary
of the block.

The algorithm of the mesh modification. Suppose that we are given the
mesh of $B(t^{n-1})$, $\varphi$ and $\tau_n$. We shall call the layer in the direction of $b$ from
$M(t^{n-1})$ the moving layer.

1. Compute first the new position of the moving part, i.e. $M(t^{n-1} + \tau_n)$.

2. If the height of the moving layer is smaller than half of the original height
(i.e. $\frac{1}{L}|b|$) or, even worse, $M(t^{n-1} + \tau_n)$ does not belong to the moving
layer, reduce $\tau_n$ so that in the new position of the moving part the height
of the layer is exactly half of the original height, and take $M(t^n) = M(t^{n-1} + \tau_n)$.

3. Inherit the topological structure of the mesh of $B(t^{n-1})$ to the mesh of $B(t^n)$;
only update the coordinates of the nodes of $M(t^n)$ according to the prescribed
mapping $\varphi$ and re-calculate the geometrical data (area of sides, etc.). In this
case we assign the node velocities as mean velocities, for example for the
node $A$ on the moving part we take $\kappa(t^{n-1}, x_A) = (x_A(t^n) - x_A(t^{n-1})) / \tau_n$.

4. If the height of the moving layer is exactly half of the original, we shall call
the neighbouring layer (in the direction of $b$) the new moving layer, and
its nodes back in direction $b$ are snapped to the corresponding nodes on
the moving part. The former nodes of the moving part are snapped back to the initial position
of the former active layer and are deactivated (i.e. marked as no longer
belonging to the flow domain). Of course, the geometrical
data have to be updated and, since the tetrahedra in the new moving layer
corresponding to the moving part are derived by joining two layers, the state
variables have to be interpolated (discussed below).

Fig. 1. Layered tetrahedral mesh generation (panels: triangulation of $S_0$, layer of prisms, layer of tetrahedra)

Step 4 in the algorithm above is called the snapping step and the whole
algorithm is the snapper (cf. [2]). For an illustration of this algorithm in the 2D
case see Figure 2. A whole 3D tetrahedral mesh has too complex a structure
to illustrate the snapping on it; however, the idea can be understood in 2D,
and in Figure 3 we present the basic blocks of a 3D mesh, before and after
a snapping.
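
The time-step limitation and the snapping step can be illustrated by a deliberately simplified 1D analogue, sketched below in Python. It is an assumption-laden toy and not the authors' implementation: layers are plain intervals, only compression is handled, the function name `advance_moving_layer` is hypothetical, and the change of the state during the deformation itself (the job of the flow solver) is not modelled. Only the mesh logic and the conservative merge are shown.

```python
def advance_moving_layer(heights, values, displacement, h0, eps=1e-12):
    """Move the piston by `displacement` into a 1D column of layers.

    heights[0] is the moving layer and h0 the original layer height.
    The step is cut so that the moving layer never becomes thinner than
    h0 / 2; when it reaches h0 / 2 it is merged ("snapped") with its
    neighbour, using a volume-weighted, hence conservative, average.
    Returns the displacement that was actually performed.
    """
    step = min(displacement, heights[0] - 0.5 * h0)
    heights[0] -= step
    if heights[0] <= 0.5 * h0 + eps and len(heights) > 1:
        merged = heights[0] + heights[1]
        # Conservative interpolation: the total content is preserved.
        values[1] = (heights[0] * values[0] + heights[1] * values[1]) / merged
        heights[1] = merged
        del heights[0], values[0]
    return step
```

After a snap the new moving layer has 1.5 times the original height, which corresponds to the joined two layers mentioned in Step 4.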



Fig. 2. Sketch for the snapper algorithm (labels: deform, snap, gas, solid, type 2 snap)

Due to the mesh generation and the mesh modification algorithms given
above, the tetrahedral meshes before and after snapping have good properties
which enable us to do the interpolation of the state variables in an effective and
conservative way. One can observe that the grids before and after snapping can
be divided into “snapping blocks” of 2 or 3 prisms in such a way that these blocks
before and after snapping are coincident.

It is obvious that snappings must be handled in a different way if the base
triangle of a prism of the moving layer is in the interior of $M$ (type 1
snap, 2 prism snapping blocks) or if there are only one or two vertices of the
base triangle on $M$ (type 2 snap, 3 prism snapping blocks). Type 1 and
2 snapping blocks in 2D are marked in Figure 2. In Figure 3 we present the
structure of type 1 and type 2 snapping blocks in 3D. We have to remark
that, depending on which vertices of the base triangle are on $M$, there are 6
different cases of type 2 snappings.






Fig. 3. Corresponding snapping blocks in type 1 (left) and type 2 (right) cases 



The interpolation of the state variables can be done locally, inside the
snapping blocks. This means interpolation between 3 new and 6 old tetrahedra
in a type 1 snap, and between 9 new and 9 old tetrahedra in a type 2 snap. Since our method
is of first order and the state variables are of density nature assigned to tetrahedra,
these interpolations can be implemented in a natural way: the state variables after
snapping are linear combinations of the pre-snapping values, and the elements of the
transformation matrices can be calculated by determining the overlapping volume
ratios of the tetrahedra. For example, in a type 1 snap the interpolation matrix between
the 3 new and the 6 old tetrahedra is the following (if we use an appropriate order of the tetrahedra):

$$\begin{pmatrix} 1/3 & 2/9 & 4/27 & 8/27 & 0 & 0 \\ 0 & 1/9 & 4/27 & 8/27 & 4/9 & 0 \\ 0 & 0 & 1/27 & 2/27 & 2/9 & 2/3 \end{pmatrix}$$

Notice that this snapping algorithm, including the elements of the interpolation
matrices, is independent of the triangulation of the moving surface.
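
Two properties of the type 1 matrix above can be checked with exact rational arithmetic, as in the following Python sketch. The volumes are an assumption inferred from the algorithm, not stated in the source: in units of the volume of a full-height prism, the three old tetrahedra of the half-height moving layer have volume 1/6 each, the three old tetrahedra of the neighbouring full layer 1/3 each, and the three new tetrahedra of the joined layer 1/2 each.

```python
from fractions import Fraction as F

# Rows: new tetrahedra, columns: old tetrahedra (type 1 snap).
M = [
    [F(1, 3), F(2, 9), F(4, 27), F(8, 27), F(0), F(0)],
    [F(0), F(1, 9), F(4, 27), F(8, 27), F(4, 9), F(0)],
    [F(0), F(0), F(1, 27), F(2, 27), F(2, 9), F(2, 3)],
]

# Assumed volumes in units of one full-height prism (see the lead-in).
old_volumes = [F(1, 6)] * 3 + [F(1, 3)] * 3
new_volumes = [F(1, 2)] * 3


def interpolate(old_values):
    """State values on the new tetrahedra as combinations of the old ones."""
    return [sum(m * u for m, u in zip(row, old_values)) for row in M]


def total(volumes, values):
    """Integral of a piecewise constant density over the snapping block."""
    return sum(vol * val for vol, val in zip(volumes, values))
```

Each row sums to 1, so constant states are reproduced, and with the assumed volumes the volume-weighted total is unchanged, which is the conservation property claimed in the text.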



3.2 The geometrical parameters for the scheme 

In order to obtain appropriate geometrical parameters $\bar n_{jl}$, $\bar\kappa_{jl}$ and $\bar s_{jl}$, the
discrete version of the geometric conservation law (GCL) gives us a guideline.
For the concept and importance of the discrete GCL condition consult e.g. [4].
The GCL condition is derived from the fact that the constant flow $u(t, x) = u^* = \text{const.}$
is a solution of (1) and also of its weak form (7), whenever a suitable
BC is prescribed. Hence we have for all moving subdomains $V$ and $[t^{n-1}, t^n]$
(cf. (7))

$$\left[ |V(t^n)| - |V(t^{n-1})| \right] u^* + f(u^*) \cdot \left( \int_{t^{n-1}}^{t^n} \int_{\partial V(t)} n\,ds\,dt \right) - \left( \int_{t^{n-1}}^{t^n} \int_{\partial V(t)} \kappa \cdot n\,ds\,dt \right) u^* = 0,$$

i.e., employing the identity $\int_{\partial V(t)} n\,ds = 0$,

$$|V(t^n)| - |V(t^{n-1})| - \int_{t^{n-1}}^{t^n} \int_{\partial V(t)} \kappa \cdot n\,ds\,dt = 0. \tag{10}$$

It is a natural requirement that the discretization of the problem should preserve
this property, called discrete GCL, i.e. the deformation of the mesh alone
should not change a constant flow, which means that, for all $u^* = \text{const.}$, if
$u_j^{n-1} = u^*$ for all $j$ then $u_j^n = u^*$ for all $j$.

Lemma 2. If the numerical flux function $g$ is conservative and consistent,
i.e. for all $u$, $v$, $n$, $\kappa$ there holds $g(u, v, n, \kappa) = -g(v, u, -n, \kappa)$ and
$g(u, u, n, \kappa) = f(u) \cdot n - \kappa \cdot n\,u$, respectively, then the discrete GCL holds for the method (8)
whenever

$$\sum_{l} \bar s_{jl}\, \bar n_{jl} = 0 \quad \text{and} \quad |T_j(t^n)| - |T_j(t^{n-1})| = \tau_n \sum_{l} \bar s_{jl}\, \bar\kappa_{jl} \cdot \bar n_{jl} \quad \forall j. \tag{11}$$

Proof. Substituting $u_j^{n-1} = u_{jl}^{n-1} = u^*$ into (8), the statement follows from the
consistency and conservativity of the method. □



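Lemma 3. The method (8) with a conservative and consistent numerical flux
$g$ and the “snapper” mesh deformation respects the discrete GCL whenever

$$\bar n_{jl} = n_{jl}\!\left( \frac{t^{n-1} + t^n}{2} \right), \quad \bar s_{jl} = \left| S_{jl}\!\left( \frac{t^{n-1} + t^n}{2} \right) \right|, \quad \bar\kappa_{jl} := \frac{1}{3} \sum_{A:\ \text{node of } S_{jl}} \kappa(t^{n-1}, x_A^{n-1}). \tag{12}$$

Proof. The first relation of (11) follows from the fact that the left hand side
of the equation equals the integral of the outer normal vector over the surface
of $T_j((t^{n-1} + t^n)/2)$, which is the zero vector. To prove the second relation of
(11) it is enough to check by (10) that
$\tau_n \sum_l \bar s_{jl}\, \bar\kappa_{jl} \cdot \bar n_{jl} = \int_{t^{n-1}}^{t^n} \int_{\partial V(t)} \kappa \cdot n\,ds\,dt$ with $V = T_j$, but
this follows now from the actual choice of parameters and the definition of $\kappa$. □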



In our actual algorithms applied to the problems reported in this paper we used
(12).

We remark that, besides being a natural requirement, the discrete GCL condition
has been proven to guarantee first order accuracy of the scheme (8),
provided the method is in addition accurate on fixed meshes, see [4].
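
The two geometric identities behind the discrete GCL can be checked numerically for a single tetrahedron, as in the Python sketch below. It assumes the choice described in Lemma 3 (face normals and areas taken at the middle of the time step, face velocity taken as the mean of the node velocities) and, as in the snapper setting, node velocities that are all parallel to one translation direction. The function names are hypothetical and the code is a check of the identities, not part of the authors' solver.

```python
def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def volume(p):
    """Unsigned volume of the tetrahedron with the four nodes p."""
    return abs(dot(sub(p[1], p[0]), cross(sub(p[2], p[0]), sub(p[3], p[0])))) / 6.0


def gcl_residuals(nodes, velocities, tau):
    """Residuals of sum(s*n) = 0 and of the volume-change identity."""
    mid = [[x[i] + 0.5 * tau * w[i] for i in range(3)]
           for x, w in zip(nodes, velocities)]
    new = [[x[i] + tau * w[i] for i in range(3)]
           for x, w in zip(nodes, velocities)]
    normal_sum = [0.0, 0.0, 0.0]
    swept = 0.0
    for opposite in range(4):
        face = [k for k in range(4) if k != opposite]
        a, b, c = (mid[k] for k in face)
        area_vec = [0.5 * x for x in cross(sub(b, a), sub(c, a))]  # area * normal
        if dot(area_vec, sub(a, mid[opposite])) < 0.0:  # orient outward
            area_vec = [-x for x in area_vec]
        kappa = [sum(velocities[k][i] for k in face) / 3.0 for i in range(3)]
        normal_sum = [normal_sum[i] + area_vec[i] for i in range(3)]
        swept += tau * dot(kappa, area_vec)
    closure = max(abs(x) for x in normal_sum)
    return closure, volume(new) - volume(nodes) - swept
```

For general, non-parallel node velocities the face area vector is quadratic in time and the midpoint evaluation is no longer exact, so the second residual need not vanish in that case.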



3.3 The numerical flux function: Vijayasundaram’s function 

For the numerical flux function $g$ we employ Vijayasundaram’s numerical flux
function (see [12] and also [3], [7]), which proved accurate in our former
applied problems as well (see [5]).

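Lemma 4. Let $u \in \mathbb{R}^{K+4}$, $n \in \mathbb{R}^3$ be arbitrary and $f$ defined in Section 2.
Then we have $f(u) n = C(u, n) u$ with $C(u, n) := \frac{\partial (f(u) n)}{\partial u}$. Moreover,
the eigenvalues of $C(u, n)$ are

$$\lambda_1 = \dots = \lambda_{K+2} = v \cdot n, \quad \lambda_{K+3} = v \cdot n + \sqrt{c}, \quad \lambda_{K+4} = v \cdot n - \sqrt{c},$$

where $Y := (\rho_1/\rho, \dots, \rho_K/\rho)^T$, $c := \frac{\partial p}{\partial(\rho_1, \dots, \rho_K)} Y + \frac{\partial p}{\partial e} H$, $H := (e + p)/\rho$.

The right eigenvectors of $C(u, n)$ corresponding to $\lambda_{K+3}$ and $\lambda_{K+4}$ are

$$\begin{pmatrix} Y \\ v \pm \sqrt{c}\, n \\ H \pm \sqrt{c}\, v \cdot n \end{pmatrix}.$$

In the formulas above we have, with the specific heat capacity at constant volume
$c_{V,m}(T) = dI_m(T)/dT$,

$$\frac{\partial p}{\partial e} = \frac{\sum_m R \rho_m / W_m}{\sum_m \rho_m c_{V,m}(T)}, \qquad \frac{\partial p}{\partial \rho_m} = \frac{R T}{W_m} - \frac{\partial p}{\partial e} I_m(T).$$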



Proof. The results follow from tedious computations and can be validated by
checking the definitions. □

In (9) we have $f(u) \cdot n - \kappa \cdot n\,u = C(u, n, \kappa) u$ with $C(u, n, \kappa) := C(u, n) - \kappa \cdot n\, I$
($I$ is the $(K+4)$-by-$(K+4)$ identity matrix). Then we chose for $g$ the function

$$g(u, v, n, \kappa) = g_{\mathrm{Vijaya}} := C^{+}\!\left( \tfrac{u+v}{2}, n, \kappa \right) u + C^{-}\!\left( \tfrac{u+v}{2}, n, \kappa \right) v. \tag{13}$$

Note that the computation of $C^{+}$ (and $C^{-}$) is done by using the diagonalization
$C^{\pm} = Q^{-1} D^{\pm} Q$, where $Q$ can be computed from the eigenvectors of $C$ and
$D = \mathrm{diag}(\lambda_i(C) - n \cdot \kappa)$.

It is clear that Vijayasundaram’s flux function is conservative and consistent;
therefore, as a consequence of Lemma 2, the method (8) with the snapper
mesh deformation and geometrical parameters (12) respects the discrete GCL
condition.
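
The splitting idea behind this flux can be illustrated on a scalar analogue, sketched below in Python. It is not the Euler flux of the paper: it assumes the linear scalar flux $f(u) = a\,u$ with a constant vector $a$, for which the matrix $C(u, n, \kappa)$ reduces to the number $a \cdot n - \kappa \cdot n$, so no eigen-decomposition and no evaluation at the mean state $(u+v)/2$ is needed. The function name `flux_scalar` and the default value of `a` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def flux_scalar(u, v, n, kappa, a=(1.0, 0.0, 0.0)):
    """Upwind-split flux for f(u) = a*u on a face moving with velocity kappa.

    u is the state inside the cell, v the state of the neighbour,
    n the outer unit normal of the face.
    """
    c = dot(a, n) - dot(kappa, n)            # scalar counterpart of C(., n, kappa)
    c_plus, c_minus = max(c, 0.0), min(c, 0.0)
    return c_plus * u + c_minus * v          # outgoing part uses u, incoming part v
```

The sketch satisfies the two properties used in Lemma 2: `flux_scalar(u, u, n, kappa)` equals $(a \cdot n - \kappa \cdot n)\,u$ (consistency), and exchanging the two states while reversing the normal changes the sign of the flux (conservativity).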



4 Applications 

We implemented the numerical algorithm, including the snapper
mesh generation, in the ANSI C programming language and applied it to several
problems. Here we display some results for two problems. Based on our experience
we may say that the code performs well for the considered test and
real-life problems.

4.1 Test problem: rectangular block with an oscillating wall

We consider a rectangular block $\Omega(t) = [0, 1] \times [0, 0.2] \times [0, 0.5 - 0.32 \cos(100t)]$,
$t \in [0, 0.25]$, and a one-component flow with initial data $v_0 = 0$, $\rho_0 = 1.3$,
$T_0 = 293\,\mathrm{K}$; ideal gas EOS: $\gamma = 1.404$, $p = (\gamma - 1)(e - |\rho v|^2 / (2\rho))$,
and slip BC. For reference we computed exactly the total energy
$E(t) = \int_{\Omega(t)} e\,dx \approx 9759 \left( \frac{0.18}{0.5 - 0.32 \cos(100t)} \right)^{0.404}$
and the components of the total momentum $P_i(t) := \int_{\Omega(t)} \rho v_i\,dx$,
which turned out to be identically 0 for $i = 1, 2$, and
$P_3 = -0.7488 \sin 100t$.

We tested our method, which respects the GCL property, and the explicit
Euler method for time stepping (i.e. all geometrical data are evaluated at the
beginning of the time step; we know that this method does not respect the GCL).
We had a mesh of $6 \cdot (20 \times 5 \times 20) = 12000$ tetrahedra (at most).

We found numerically that there is a very small error in P 3 and E(t) (with 
relative error at most 10“^). Moreover, our method produced the fifth of the 
error of the explicit Euler method in P 2 , underlining the importance of the 
discrete GCL property. 
4.2 Application: gas flow in a high voltage circuit breaker

We investigated the gas flow in a high voltage circuit breaker. In these investigations
the gas flow was induced only by the mechanical constraints of a configuration
of pistons and valves, i.e. the current was taken identically zero. The fluid
flow domain is not exactly axisymmetric, but employing its rotational symmetry
of 90 degrees and plane symmetry it is sufficient to compute the flow in its
eighth part. Hence we did not assume a priori the usual axisymmetric formulations
(cf. [8]). The gas was a mixture of two components that were originally
separated. For an illustration of our results see Figure 4. Our code performed
very well: the computed and measured pressure and density values were compared
at two control points and the relative errors between the
measured and computed quantities were at most 5%. However, in certain small parts of the fluid
domain we experienced spurious pressure oscillations, but these occurred only
for a short period during the simulation time. This was somehow expected
(see e.g. [1], [9]), and clipping negative values from the energy approximations
cured the code. Finally we remark that the flow proved to be significantly
non-axisymmetric (cf. Figure 4), which justifies our model.




Fig. 4. Graph of $Y_1 := \rho_1 / \rho$ at two points of time in two perpendicular plane sections
(the plane of the cross section and the direction of movement of parts is marked)
Summary. Interior penalty discontinuous Galerkin methods for the time-harmonic
Maxwell equations in the frequency domain, together with their stability and convergence
properties, are reviewed. A new set of numerical tests carried out on a model
problem with a singular analytical solution validates the theoretical error estimates
of the presented method for the high-frequency case.



1 Introduction 

In this paper, we review recent work on discontinuous Galerkin (DG) methods
for the discretization of the time-harmonic Maxwell equations: find the electric
field $u$ such that

$$\nabla \times (\mu^{-1} \nabla \times u) - \omega^2 \varepsilon u + i \omega \sigma u = j \quad \text{in } \Omega, \tag{1}$$

$$n \times u = 0 \quad \text{on } \Gamma = \partial\Omega. \tag{2}$$

Here, $\Omega$ is a simply-connected Lipschitz polyhedron in $\mathbb{R}^3$ with connected
boundary $\Gamma = \partial\Omega$ and outward normal unit vector $n$. The function $j$ is a given
source term in $L^2(\Omega)^3$. The temporal frequency is denoted by $\omega > 0$. The
real-valued functions $\mu$, $\varepsilon$ and $\sigma$ are the magnetic permeability, the electric
permittivity, and the electric conductivity, respectively.

The main motivation for using a DG approach for the numerical approximation
of the above problem is that DG methods, being based on discontinuous
finite element spaces, can easily handle meshes with hanging nodes and
local spaces of different orders. This renders DG methods ideally suited for
hp-adaptive algorithms. Moreover, the implementation of discontinuous elements
can be based on standard shape functions; a convenience that is particularly
advantageous for high-order elements and that is not straightforwardly shared
by standard edge elements commonly used in computational electromagnetics
(see [1, 3, 14] and the references therein for hp-adaptive edge element methods).
On the other hand, in the hp-context, since most of the degrees of freedom are
in the interior of the elements, the increase in the total number of degrees of
freedom with respect to the corresponding conforming method is not dramatic.

In this paper, we focus on interior penalty DG discretizations for (1)-(2) in
the low-frequency and high-frequency cases (the term $\omega^2 \varepsilon u$ in (1) is neglected
in the former case, whereas the term $i \omega \sigma u$ is neglected in the latter).

In the low-frequency case, problem (1)-(2) has to be completed by a divergence-free
constraint in the subdomain $\Omega_0 \subset \Omega$ covered by insulating material
where $\sigma = 0$ (additional scalar constraints arise if $\partial\Omega_0$ is not connected, see,
e.g., [13]). This results in the following system:

$$\nabla \times (\mu^{-1} \nabla \times u) + i \omega \sigma u = j \ \text{in } \Omega, \qquad \nabla \cdot (\varepsilon u) = 0 \ \text{in } \Omega_0, \qquad n \times u = 0 \ \text{on } \Gamma. \tag{3}$$

The main difficulty here is the incorporation of the divergence constraint in 
the DG framework. Following [8], we show that this can be achieved using a 
mixed approach where the constraint is accounted for by a suitable Lagrange 
multiplier. We present the underlying theoretical properties, as well as the 
energy norm a priori and a posteriori estimates that were derived in [8] and [6] . 
For further numerical tests, we also refer to [9]. 

In the high-frequency case, the problem consists in finding the electric
field $u$ such that

$$\nabla \times (\mu^{-1} \nabla \times u) - \omega^2 \varepsilon u = j \ \text{in } \Omega, \qquad n \times u = 0 \ \text{on } \Gamma. \tag{4}$$

Here, we assume that $\omega^2$ is not an eigenvalue of the underlying Maxwell eigenproblem.
While the design of interior penalty DG methods is straightforward,
the key difficulty for (4) arises in the numerical analysis of the methods due to
the indefiniteness caused by the zero order term. We present the main results
of a novel error analysis that was recently developed in [4]. These results show
that DG methods for (4) yield optimal rates of convergence in the energy norm
and the $L^2$-norm. We further present a new set of numerical results on a test
problem with a singular solution.

To simplify the presentation in this article, we assume that $\mu$ and $\varepsilon$ are
constants. However, the analysis in [8] covers the case of piecewise smooth
material coefficients, whereas the theoretical results in [4] hold for smooth
coefficients $\mu$ and $\varepsilon$ only.



2 Discontinuous Galerkin Discretizations 

In this section, we introduce interior penalty DG methods for the two model 
problems in (3) and (4), and review their theoretical properties. 
2.1 Meshes, Trace Operators, Finite Element Spaces and DG
Norms

We consider shape-regular affine meshes $\mathcal{T}_h$ that partition the domain $\Omega$ into
tetrahedra $\{K\}$; the parameter $h$ denotes the mesh size of $\mathcal{T}_h$ given by $h = \max_{K \in \mathcal{T}_h} h_K$,
where $h_K$ is the diameter of the element $K \in \mathcal{T}_h$. We denote
by $\mathcal{F}_h^I$ the set of all interior faces of elements in $\mathcal{T}_h$, by $\mathcal{F}_h^B$ the set of all
boundary faces, and set $\mathcal{F}_h := \mathcal{F}_h^I \cup \mathcal{F}_h^B$. We define the local meshsize $\mathtt{h}$ on $\mathcal{F}_h$
by setting $\mathtt{h}(x) := \max\{h_{K^+}, h_{K^-}\}$ if $x$ is in the interior of $\partial K^+ \cap \partial K^-$, and
by $\mathtt{h}(x) := h_K$ if $x \in \partial K$ is on the boundary.

For piecewise smooth vector-valued and scalar-valued functions $v$ and $q$,
respectively, we introduce the following trace operators. On an interior face
$f \in \mathcal{F}_h^I$ shared by two neighboring elements $K^+$ and $K^-$ with unit outward
normal vectors $n^\pm$, respectively, denoting by $v^\pm$ and $q^\pm$ the traces of $v$ and $q$
taken from within $K^\pm$, respectively, we define the jumps and averages across
$f$ by $[\![v]\!]_T := n^+ \times v^+ + n^- \times v^-$, $[\![q]\!]_N := q^+ n^+ + q^- n^-$, $\{\!\{v\}\!\} := (v^+ + v^-)/2$
and $\{\!\{q\}\!\} := (q^+ + q^-)/2$, respectively. On a boundary face $f \in \mathcal{F}_h^B$, we set
$[\![v]\!]_T := n \times v$, $\{\!\{v\}\!\} := v$ and $[\![q]\!]_N := q\,n$.
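
The trace operators on an interior face can be made concrete with a small Python sketch that works on point values. The function names are hypothetical, and the sketch assumes that the two outward normals are opposite, $n^- = -n^+$, as is the case on a shared planar face.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def tangential_jump(n_plus, v_plus, v_minus):
    """n+ x v+ + n- x v-, with n- = -n+ on an interior face."""
    n_minus = [-x for x in n_plus]
    jp, jm = cross(n_plus, v_plus), cross(n_minus, v_minus)
    return [jp[i] + jm[i] for i in range(3)]


def normal_jump(n_plus, q_plus, q_minus):
    """q+ n+ + q- n-, with n- = -n+ on an interior face."""
    return [(q_plus - q_minus) * n_plus[i] for i in range(3)]


def average(v_plus, v_minus):
    """Componentwise average of the two vector traces."""
    return [(v_plus[i] + v_minus[i]) / 2.0 for i in range(3)]
```

The tangential jump vanishes whenever the two traces have the same tangential components, even if their normal components differ; this is the kind of continuity that the penalty term in the curl-curl form controls.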

For a given partition $\mathcal{T}_h$ of $\Omega$ and an approximation order $\ell \ge 1$, we introduce
the following discontinuous finite element spaces:

$$V_h := \{ v \in L^2(\Omega)^3 : v|_K \in \mathcal{P}^\ell(K)^3 \ \forall K \in \mathcal{T}_h \},$$

$$Q_h := \{ q \in L^2(\Omega) : q|_K \in \mathcal{P}^{\ell+1}(K) \ \forall K \in \mathcal{T}_h \},$$

where $\mathcal{P}^k(K)$ denotes the space of polynomials of total degree at most $k$ on
$K$.

Finally, denoting by $\|\cdot\|_{s,D}$, for $s \ge 0$ and $D$ a bounded domain in $\mathbb{R}^2$
or $\mathbb{R}^3$, the standard norm in the Sobolev space $H^s(D)^d$, $d \ge 1$, we define the
following DG norms with which we will measure the approximation errors:

l|v|| V) = ll^^^llo,^ + X 

2.2 DG Discretization of the Low-Frequency Problem (Insulating
Materials)

We consider the case of insulating materials, i.e., $\Omega_0 = \Omega$, since all the key
difficulties in the numerical treatment of (3) are already present in this particular
case. The DG method for the discretization of (3) with $\Omega_0 = \Omega$ and
divergence-free source term $j$ is based on the following mixed formulation of
the problem:

$$\nabla \times (\mu^{-1} \nabla \times u) - \varepsilon \nabla p = j \quad \text{in } \Omega,$$

$$\nabla \cdot (\varepsilon u) = 0 \quad \text{in } \Omega, \tag{6}$$

$$n \times u = 0, \quad p = 0 \quad \text{on } \Gamma.$$
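Here, $p$ is the Lagrange multiplier related to the divergence constraint. The
standard variational formulation of (6) is well-posed in $H_0(\mathrm{curl}; \Omega) \times H_0^1(\Omega)$,
see, e.g., [3, 14].

The mixed DG method for (6) then reads as follows: find $(u_h, p_h)$ in $V_h \times Q_h$
such that

$$a_h(u_h, v) + b_h(v, p_h) = (j, v), \qquad b_h(u_h, q) - c_h(p_h, q) = 0 \tag{7}$$

for all $(v, q) \in V_h \times Q_h$, where the discrete forms $a_h(\cdot, \cdot)$, $b_h(\cdot, \cdot)$ and $c_h(\cdot, \cdot)$
are defined, respectively, by

$$a_h(u, v) = (\mu^{-1} \nabla_h \times u, \nabla_h \times v) - \int_{\mathcal{F}_h} [\![u]\!]_T \cdot \{\!\{\mu^{-1} \nabla_h \times v\}\!\}\,ds - \int_{\mathcal{F}_h} [\![v]\!]_T \cdot \{\!\{\mu^{-1} \nabla_h \times u\}\!\}\,ds + \int_{\mathcal{F}_h} a\,\mu^{-1} [\![u]\!]_T \cdot [\![v]\!]_T\,ds,$$

$$b_h(v, p) = -(\varepsilon v, \nabla_h p) + \int_{\mathcal{F}_h} \{\!\{\varepsilon v\}\!\} \cdot [\![p]\!]_N\,ds,$$

$$c_h(p, q) = \int_{\mathcal{F}_h} c\,\varepsilon\,[\![p]\!]_N \cdot [\![q]\!]_N\,ds.$$

Here, and in the following, we denote by $(\cdot, \cdot)$ the standard inner product
in $L^2(\Omega)^d$, $d \ge 1$, and use $\nabla_h$ to denote the elementwise application of the
operator $\nabla$. Further, we use the notation $\int_{\mathcal{F}_h} \varphi\,ds := \sum_{f \in \mathcal{F}_h} \int_f \varphi\,ds$. The
form $a_h(\cdot, \cdot)$ corresponds to the interior penalty discretization of the curl-curl
operator, the form $b_h(\cdot, \cdot)$ discretizes the divergence operator in a DG fashion,
and the form $c_h(\cdot, \cdot)$ is the interior penalty form that weakly enforces the
continuity of $p_h$.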

The parameters a and c in are the usual interior penalty stabiliza- 

tion functions defined by 



a ah c 7b (8) 

where a and 7 are positive parameters independent of the mesh size. 

The results contained in the following theorem have been proven and nu- 
merically validated in [8] . 

Theorem 1. There is a parameter $\alpha_{\min} > 0$ only depending on the shape
regularity of the mesh and the polynomial approximation degree $\ell$ such that,
for parameters $\alpha \ge \alpha_{\min}$ and $\gamma > 0$ in (8), the DG method (7) possesses
a unique solution.

Moreover, assume that the analytical solution $(\mathbf{u},p)$ of (6) satisfies the
smoothness assumptions $\varepsilon\mathbf{u} \in H^s(\Omega)^3$, $\mu^{-1}\nabla \times \mathbf{u} \in H^s(\Omega)^3$
and $p \in H^{s+1}(\Omega)$ for an exponent $s > 1/2$, and let $(\mathbf{u}_h,p_h)$ be the DG
approximation on conforming meshes defined by (7), with $\alpha \ge \alpha_{\min}$ and
$\gamma > 0$. Then we have the optimal a priori error bound

$\|\mathbf{u}-\mathbf{u}_h\|_{V(h)} + \|p-p_h\|_{Q(h)} \le C h^{\min\{s,\ell\}} \left[ \|\varepsilon\mathbf{u}\|_{s,\Omega} + \|\mu^{-1}\nabla \times \mathbf{u}\|_{s,\Omega} + \|p\|_{s+1,\Omega} \right]$,

with a constant $C$ independent of the mesh size.

The analysis developed in [8] is valid for piecewise smooth coefficients $\mu$
and $\varepsilon$, and the error estimate in Theorem 1 holds for piecewise smooth solutions.
On the other hand, it requires the assumption of conformity of the meshes,
although numerical tests in [9] have shown that the method is robust on meshes
with hanging nodes as well.

Moreover, the following energy norm a posteriori error estimate has been
established and tested in [6].

Theorem 2. We assume that $\nabla \cdot \mathbf{j} = 0$ holds, so that $p = 0$. Let $(\mathbf{u}_h,p_h)$ be
the DG approximation on conforming meshes defined by (7), with $\alpha \ge \alpha_{\min}$
and $\gamma > 0$. Then there is a constant $C > 0$ independent of the mesh size, such
that

$\|\mathbf{u}-\mathbf{u}_h\|_{V(h)} + \|p-p_h\|_{Q(h)} \le C \Big( \sum_{K \in \mathcal{T}_h} \eta_K^2 \Big)^{1/2}$,

where the elemental error indicator $\eta_K$ is given by
Vk = llj - V X X yih)^sVph\\l^K^^^K~^\\^K{^h^ 

|u/iJr||o,ai^ + hK\\l^^hlN\\o^dK\r + • (^^^)IIo,k 

IPhjNWo^dK^ 

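$\sigma \in (1/2,1]$ is the parameter of the embeddings
$H_0(\mathrm{curl};\Omega) \cap H(\mathrm{div};\Omega) \hookrightarrow H^\sigma(\Omega)^3$ and
$H(\mathrm{curl};\Omega) \cap H_0(\mathrm{div};\Omega) \hookrightarrow H^\sigma(\Omega)^3$ (see [2]), and
$\boldsymbol{\tau}_K(\mathbf{v})$ is the numerical flux defined by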

T M = ^ V X v} - jU-^a |v1t) on dK \ F, 

^ \ UK X XV — p~^a. {uk X v)) on dK fl F. 

2.3 DG Discretization of the High-Frequency Problem 

For problem (4), the interior penalty DG method is given by: find
$\mathbf{u}_h \in \mathbf{V}_h$ such that

$a_h(\mathbf{u}_h, \mathbf{v}) - \omega^2(\varepsilon\mathbf{u}_h, \mathbf{v}) = (\mathbf{j}, \mathbf{v})$ (9)

for all $\mathbf{v} \in \mathbf{V}_h$, where the discrete form $a_h(\cdot,\cdot)$ is the same as in Section 2.2.
The interior penalty stabilization function $a \in L^\infty(\mathcal{F}_h)$ is defined again by (8),
with $\alpha$ chosen independently of the mesh size and the frequency.

The following a priori error estimates in the energy norm and in the
$L^2$-norm have been proven in [4]. Their proof is based on techniques similar to
those of [12] and [11, Section 7.2], for the energy error bound, and of [10,
Theorem 3.2], for the $L^2$-error bound, combined with novel results that allow
one to approximate a discontinuous function by a conforming one. This result
is instrumental in controlling the non-conformity of the DG method.

Theorem 3. Assume that the analytical solution $\mathbf{u}$ of (4) satisfies the
regularity assumptions $\varepsilon\mathbf{u} \in H^s(\Omega)^3$ and $\mu^{-1}\nabla \times \mathbf{u} \in H^s(\Omega)^3$, for
$s > 1/2$, and let $\mathbf{u}_h$ be the DG approximation on conforming meshes defined by (9).
Then there is a parameter $\alpha_{\min} > 0$ only depending on the shape regularity of
the mesh and on the polynomial approximation degree $\ell$, and a mesh size $h_0 > 0$
such that, for $\alpha \ge \alpha_{\min}$ and $0 < h < h_0$, we have the optimal a priori error bound

$\|\mathbf{u}-\mathbf{u}_h\|_{V(h)} \le C h^{\min\{s,\ell\}} \left[ \|\varepsilon\mathbf{u}\|_{s,\Omega} + \|\mu^{-1}\nabla \times \mathbf{u}\|_{s,\Omega} \right]$,

with a constant $C > 0$ independent of the mesh size.

Consequently, for $\alpha \ge \alpha_{\min}$, the DG method (9) admits a unique solution
$\mathbf{u}_h \in \mathbf{V}_h$ provided that $h < h_0$.

Finally, assume that the analytical solution $\mathbf{u}$ of (4) satisfies the
additional regularity assumption $\varepsilon\mathbf{u} \in H^{s+\sigma}(\Omega)^3$ for $s > 1/2$ and
$\sigma \in (1/2,1]$ the parameter of the embeddings
$H_0(\mathrm{curl};\Omega) \cap H(\mathrm{div};\Omega) \hookrightarrow H^\sigma(\Omega)^3$ and
$H(\mathrm{curl};\Omega) \cap H_0(\mathrm{div};\Omega) \hookrightarrow H^\sigma(\Omega)^3$ (see [2]). Then, for
$\alpha \ge \alpha_{\min}$, there is a mesh size $h_2 > 0$ such that, for $0 < h < h_2$, we have

$\|\varepsilon^{1/2}(\mathbf{u}-\mathbf{u}_h)\|_{0,\Omega} \le C h^{\min\{s,\ell\}+\sigma} \left[ \|\varepsilon\mathbf{u}\|_{s+\sigma,\Omega} + \|\mu^{-1}\nabla \times \mathbf{u}\|_{s,\Omega} \right]$,

with a constant $C > 0$ independent of the mesh size.

The analysis of [4] is based on duality arguments; thus, the result of
Theorem 3 can easily be extended to smooth material coefficients $\mu$ and $\varepsilon$.
However, the extension to piecewise smooth coefficients requires alternative
mathematical tools; this is the subject of ongoing research.



3 Numerical Example 

In this section we present a numerical example to highlight the practical
performance of the DG method introduced and analyzed in this article for
the numerical approximation of the high-frequency indefinite time-harmonic
Maxwell equations in (4), considering a model problem with a singular
solution. Throughout this section, we take $\mu = \mu_0$ and $\varepsilon = \varepsilon_0$, the permeability
and permittivity of the free space, respectively, and select the interior penalty
parameter $\alpha$ in (8) as follows: $\alpha = 10\,\ell^2$. As is standard in electromagnetic
computations, we scale the electric field by $\mathbf{u} \mapsto \mu_0\mathbf{u}$ and obtain a problem for
the scaled field (that we again denote by $\mathbf{u}$) of the form

$\nabla \times \nabla \times \mathbf{u} - k^2\mathbf{u} = \mathbf{j}$ in $\Omega$, $\mathbf{n} \times \mathbf{u} = \mathbf{0}$ on $\Gamma$, (10)

with a rescaled right-hand side (again denoted by $\mathbf{j}$) and the wave number
$k = \omega\sqrt{\mu_0\varepsilon_0}$.

For simplicity, we restrict ourselves to the two-dimensional analogue of (10).
To this end, we let $\Omega$ be the L-shaped domain $(-1,1)^2 \setminus [0,1) \times (-1,0]$ and
select $\mathbf{j}$ (and suitable non-homogeneous boundary conditions for $\mathbf{u}$) so that the
analytical solution $\mathbf{u}$ to the two-dimensional analogue of (10) is given, in terms
of the polar coordinates $(r,\vartheta)$, by

$\mathbf{u}(x,y) = \nabla S(r,\vartheta)$, where $S(r,\vartheta) = r^{2/3}\sin(2\vartheta/3)$. (11)

Here, the boundary conditions are enforced in the usual DG manner by adding
boundary terms in the formulation (9); see [5,7] for details. The analytical
solution given by (11) then contains a singularity at the re-entrant corner
located at the origin of $\Omega$; in particular, we note that $\mathbf{u}$ lies in the Sobolev
space $H^{2/3-\varepsilon}(\Omega)^2$, $\varepsilon > 0$. This example represents a slight modification of the
numerical experiment presented in [4]; cf., also, [1].

We investigate the asymptotic convergence of the DG method on a sequence
of successively finer (quasi-uniform) unstructured triangular meshes for
$\ell = 1, 2, 3$ as the wave number $k$ increases. To this end, in Tables 1, 2, 3 and 4 we
present numerical experiments for $k = 1, 2, 4, 6$, respectively. In each case we
show the number of elements in the computational mesh, the corresponding
DG-norm of the error $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ and the numerical rate of convergence $r$. In
view of the scaling we introduced, we have taken $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ as
$\big(\|\mathbf{u}-\mathbf{u}_h\|_{0,\Omega}^2 + \|\nabla_h \times (\mathbf{u}-\mathbf{u}_h)\|_{0,\Omega}^2 + \|\mathtt{h}^{-1/2}[\![\mathbf{u}-\mathbf{u}_h]\!]_T\|_{0,\mathcal{F}_h}^2\big)^{1/2}$.
We observe that (asymptotically) $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ converges to zero at the
optimal rate $\mathcal{O}(h^{2/3-\varepsilon})$, for each fixed $\ell$ and each $k$, as $h$ tends to zero, as
predicted by Theorem 3. In particular, we make two key observations. Firstly,
for a given fixed mesh and fixed polynomial degree, an increase in the wave
number $k$ leads to an increase in the DG-norm of the error in the approximation
to $\mathbf{u}$. As pointed out in [1], where curl-conforming finite element methods were
employed for the numerical approximation of (10), the pre-asymptotic region
increases as $k$ increases; this is particularly evident when $k = 6$, cf. Table 4.
Secondly, we observe that the DG-norm of the error decreases when either the
mesh is refined or the polynomial degree is increased, as we would expect; this
is also the case when the DG-norm of the error is compared with the total
number of degrees of freedom employed in the underlying finite element space,
for each fixed $k$; for brevity these results have been omitted.
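The numerical rate $r$ reported in the tables can be recomputed from two consecutive rows. The following Python sketch does this for the first error column of Table 1 ($\ell = 1$, $k = 1$). It assumes, since the text does not state the formula, that $r$ is the logarithmic error ratio divided by the logarithmic mesh-size ratio, with $h$ taken proportional to (number of elements)$^{-1/2}$ on quasi-uniform two-dimensional meshes; the function name `convergence_rates` is illustrative only.

```python
import math

# Element counts and DG-norm errors for l = 1, k = 1
# (first error column of Table 1).
elements = [24, 96, 384, 1536, 6144]
errors = [1.525e-1, 8.875e-2, 5.393e-2, 3.348e-2, 2.096e-2]


def convergence_rates(elements, errors):
    """Observed rates between consecutive meshes.

    Assumption: quasi-uniform 2D meshes, so the mesh size h is
    proportional to (number of elements)^(-1/2).
    """
    rates = []
    for i in range(1, len(errors)):
        # Ratio h_old / h_new under the assumption above.
        h_ratio = math.sqrt(elements[i] / elements[i - 1])
        rates.append(math.log(errors[i - 1] / errors[i]) / math.log(h_ratio))
    return rates


rates = convergence_rates(elements, errors)
print([round(r, 2) for r in rates])
```

Under these assumptions the computed values agree with the rates listed for $\ell = 1$ in Table 1 and approach $2/3$, the regularity exponent of the solution (11).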

Finally, we end this section by considering the rate of convergence of the
error in the approximation to $\mathbf{u}$ measured in the $L^2$-norm. While for smooth
solutions the optimal $L^2$-order has been confirmed numerically (see [4]), the
additional regularity assumptions for the $L^2$-estimate in Theorem 3 do not
hold in the example considered here. Notwithstanding this, in Figure 1 we
plot the $L^2$-norm of the error in the approximation to $\mathbf{u}$ against the square root
of the number of degrees of freedom in the finite element space $\mathbf{V}_h$, for $k = 1$
and $k = 6$. We observe that (asymptotically) $\|\mathbf{u}-\mathbf{u}_h\|_{0,\Omega}$ converges to zero at
the rate $\mathcal{O}(h^{2/3})$, for each fixed $\ell$ and $k$, as in the case of the DG-norm of the
error.

Table 1. Convergence of $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ with $k = 1$.

            $\ell = 1$            $\ell = 2$            $\ell = 3$
Elements   error      r        error      r        error      r
      24   1.525e-1   -        8.881e-2   -        6.078e-2   -
      96   8.875e-2   0.78     5.374e-2   0.73     3.744e-2   0.70
     384   5.393e-2   0.72     3.331e-2   0.69     2.337e-2   0.68
    1536   3.348e-2   0.69     2.085e-2   0.68     1.467e-2   0.67
    6144   2.096e-2   0.68     1.310e-2   0.67     9.227e-3   0.67


Table 2. Convergence of $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ with $k = 2$.
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0.67 


2.326e-2 


0.67 


6144 
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0.67 


1.464e-2 


0.67 



Table 3. Convergence of $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ with $k = 4$.
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4 Conclusions 

In this paper, we have reviewed two interior penalty discontinuous Galerkin
methods for the numerical approximation of the time-harmonic Maxwell
equations in both the low-frequency and high-frequency regimes. The predicted
performance of the DG method for the indefinite problem in the high-frequency
case has been confirmed in a new set of numerical experiments carried out on
a model problem with a singular solution.

Table 4. Convergence of $\|\mathbf{u}-\mathbf{u}_h\|_{V(h)}$ with $k = 6$.

Fig. 1. Convergence of $\|\mathbf{u}-\mathbf{u}_h\|_{0,\Omega}$ for $k = 1$ and $k = 6$ (horizontal axis: square
root of the number of degrees of freedom)

Summary. We consider mixed hp-discontinuous Galerkin finite element methods
(DGFEM) for Stokes flow in general polygons. In particular, we show that, on
geometrically refined meshes, the hp-DGFEM yields exponential rates of convergence
for problems with piecewise analytic input data. Numerical results confirming the
exponential convergence rates are presented.



1 Introduction 

Over the last few years, several mixed discontinuous Galerkin finite element 
methods (DGFEM) have been proposed for the discretization of incompress- 
ible fluid flow problems; see, e.g., [4, 6, 7, 9, 12, 16] and the references therein. 
The main motivations that led to these schemes are that mixed DGFEM pro- 
vide robust and high-order accurate approximations, particularly in transport- 
dominated regimes, and that they are considerably flexible in the choice of 
velocity-pressure combinations, without excessive numerical stabilization. For 
example, no extra stabilization is required to use optimally matched combi- 
nations where the approximation degree for the pressure is of one order lower 
than that of the velocity; this result was first established in [11] in the context 
of linear elasticity of nearly incompressible materials. 

The work in [13] presented a unifying framework for the analysis of mixed
hp-DGFEM for Stokes flow. For discontinuous $\mathcal{Q}_k$-$\mathcal{Q}_{k-1}$ elements on
hexahedral meshes, the results there ensure (slightly suboptimal) error bounds for
the p-version of the DGFEM, where convergence is obtained by increasing the
polynomial approximation order p on a fixed mesh. However, these bounds
result in algebraic rates of convergence and are restricted to piecewise smooth
solutions, an assumption that is unrealistic in general polygons due to the
presence of corner singularities.

In this note, we report on the recent results in [14] that extend the approach
of [13] to mixed hp-DGFEM for Stokes flow in general polygons. In particular,
we show that, on geometrically refined meshes, the hp-DGFEM yields
exponential rates of convergence for problems with piecewise analytic input data;
see [18] for similar results in the context of diffusion problems. We further
present numerical results that confirm the exponential convergence rates as
predicted in [14].

2 The Stokes problem in polygons 

In this section, we introduce the Stokes problem and use the results of [10] to 
describe the regularity of its solution for piecewise analytic input data. 

2.1 The Stokes problem 

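Let $\Omega \subset \mathbb{R}^2$ be a bounded polygonal domain with outward unit normal vector
$\mathbf{n}$ on the boundary $\partial\Omega$. Then, for a given forcing term $\mathbf{f} \in L^2(\Omega)^2$ and
a Dirichlet datum $\mathbf{g} \in H^{1/2}(\partial\Omega)^2$ satisfying the compatibility condition
$\int_{\partial\Omega} \mathbf{g} \cdot \mathbf{n}\, ds = 0$, the Stokes problem is to find a velocity field
$\mathbf{u} \in H^1(\Omega)^2$ and a pressure
$p \in L_0^2(\Omega) := \{q \in L^2(\Omega) : \int_\Omega q\, dx = 0\}$ such that

$-\Delta\mathbf{u} + \nabla p = \mathbf{f}$ in $\Omega$,

$\nabla \cdot \mathbf{u} = 0$ in $\Omega$, (1)

$\mathbf{u} = \mathbf{g}$ on $\partial\Omega$.

This system is uniquely solvable; see, e.g., [5, 8] for details.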

2.2 Analytic regularity in polygons 

In [10] the regularity of the solution $(\mathbf{u},p)$ to the Stokes equations with
piecewise analytic data $\mathbf{f}$ and $\mathbf{g}$ was described in terms of the countably normed
Sobolev spaces introduced by Babuska and Guo for closely related diffusion
and elasticity problems (see [2, 3, 15], and the references cited therein). To
define these spaces, let $\{A_i\}_{i=1}^M$ denote the vertices of the domain $\Omega$. To each
vertex $A_i$ we assign a weight $\beta_i \ge 0$ and store these numbers in the M-tuple
$\beta = (\beta_1, \ldots, \beta_M)$. We define $\beta \pm j := (\beta_1 \pm j, \ldots, \beta_M \pm j)$ and use the
shorthand notation $C_1 > \beta > C_2$ to mean $C_1 > \beta_i > C_2$ for $i = 1, \ldots, M$. Writing
$r_i(x) = \min\{1, |x - A_i|\}$, we define the weight function
$\Phi_\beta(x) := \prod_{i=1}^M r_i(x)^{\beta_i}$ and introduce the semi-norms

k 

E k>l>0. 

" l«l>i 
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We denote by the completion of with respect to the norm 

k 

| a |=0 

Here, we denote by || • l|L 2 (i 7 ) the usual L^-norm. Similarly, || • \\H^(^f2) is the 
norm on the standard Sobolev space H\Q). 

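Definition 1. For an M-tuple $\beta = (\beta_1, \ldots, \beta_M)$ and $l \ge 0$, the countably
normed space $B_\beta^l(\Omega)$ consists of all functions $u$ for which $u \in H_\beta^{k,l}(\Omega)$ for
$k \ge l$ and

$\|\Phi_{\beta+k-l} D^\alpha u\|_{L^2(\Omega)} \le C d^{k-l}(k-l)!$, $|\alpha| = k \ge l$,

for constants $C > 0$, $d \ge 1$ independent of $k$. Moreover, for $l \ge 1$, the space
$B_\beta^{l-1/2}(\partial\Omega)$ is the space of traces of functions in $B_\beta^l(\Omega)$.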
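Functions in $B_\beta^l(\Omega)$ (or their traces) are referred to as piecewise analytic
functions. Indeed, they are analytic in any interior domain $\Omega_{\mathrm{int}} \subset \Omega$ with
$\overline{\Omega}_{\mathrm{int}} \subset \overline{\Omega} \setminus \bigcup_{i=1}^M \{A_i\}$, but develop singularities at the corners
$\{A_i\}_{i=1}^M$. Globally, we have $B_\beta^l(\Omega) \subset H^{l-1}(\Omega)$, but
$B_\beta^l(\Omega) \not\subset H^l(\Omega)$. The following regularity result was proved in [10].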

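Theorem 1. There exists a weight vector $0 \le \beta_{\min} < 1$ depending on the
opening angles of $\Omega$ at the vertices $\{A_i\}_{i=1}^M$ such that for weight vectors $\beta$ with
$\beta_{\min} < \beta < 1$ and piecewise analytic data
$(\mathbf{f},\mathbf{g}) \in B_\beta^0(\Omega)^2 \times B_\beta^{3/2}(\partial\Omega)^2$, the
solution $(\mathbf{u},p)$ of the Stokes system (1) satisfies
$(\mathbf{u},p) \in B_\beta^2(\Omega)^2 \times B_\beta^1(\Omega)$.

In the rest of the paper, we assume that
$(\mathbf{f},\mathbf{g}) \in B_\beta^0(\Omega)^2 \times B_\beta^{3/2}(\partial\Omega)^2$ for a
weight vector $\beta$ with $\beta_{\min} < \beta < 1$, in order to ensure the piecewise analyticity
of the solution $(\mathbf{u},p)$, as stated in Theorem 1.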



3 Discontinuous Galerkin methods 

In this section, we introduce discontinuous Galerkin methods for the Stokes 
problem (1) and review their well-posedness, using the recent results in [13]. 

3.1 Meshes 

Throughout, we assume that the domain $\Omega$ can be subdivided into
shape-regular affine meshes $\mathcal{T}_h = \{K\}$ consisting of parallelograms $K$. For each
$K \in \mathcal{T}_h$, we denote by $\mathbf{n}_K$ the outward unit normal vector to the boundary
$\partial K$, and by $h_K$ the elemental diameter. Further, we assign to each element
$K \in \mathcal{T}_h$ an approximation order $k_K \ge 1$. The local quantities $h_K$ and $k_K$ are
stored in the vectors $\mathbf{h} = \{h_K\}_{K \in \mathcal{T}_h}$ and $\mathbf{k} = \{k_K\}_{K \in \mathcal{T}_h}$, respectively. An
interior edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior of
$\partial K^+ \cap \partial K^-$, where $K^+$ and $K^-$ are two adjacent elements of $\mathcal{T}_h$. Similarly,
a boundary edge of $\mathcal{T}_h$ is the (non-empty) one-dimensional interior of
$\partial K \cap \partial\Omega$ which consists of entire edges of $\partial K$. We denote by $\mathcal{E}_I$ the union
of all interior edges of $\mathcal{T}_h$, by $\mathcal{E}_D$ the union of all boundary edges, and set
$\mathcal{E} = \mathcal{E}_I \cup \mathcal{E}_D$. We allow for meshes with 1-irregular hanging nodes, and
further assume that there is a constant $\kappa > 0$ such that
$\kappa k_K \le k_{K'} \le \kappa^{-1} k_K$ whenever $K$ and $K'$ share a common edge.



3.2 Averages and jumps 

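Next, we define average and jump operators. To this end, let $K^+$ and $K^-$
be two adjacent elements of $\mathcal{T}_h$ and $x$ be an arbitrary point on the interior
edge $e = \partial K^+ \cap \partial K^- \subset \mathcal{E}_I$. Moreover, let $q$, $\mathbf{v}$, and $\underline{\tau}$ be scalar-, vector-, and
matrix-valued functions, respectively, that are smooth inside each element $K^\pm$.
By $(q^\pm, \mathbf{v}^\pm, \underline{\tau}^\pm)$ we denote the traces of $(q, \mathbf{v}, \underline{\tau})$ on $e$ taken from within the
interior of $K^\pm$, respectively. Then, we define the averages at $x \in e$ by:

$\{\!\{q\}\!\} = (q^+ + q^-)/2$, $\{\!\{\mathbf{v}\}\!\} = (\mathbf{v}^+ + \mathbf{v}^-)/2$, $\{\!\{\underline{\tau}\}\!\} = (\underline{\tau}^+ + \underline{\tau}^-)/2$.

Similarly, the jumps at $x \in e$ are given by

$[\![q]\!] = q^+\mathbf{n}_{K^+} + q^-\mathbf{n}_{K^-}$, $[\![\mathbf{v}]\!] = \mathbf{v}^+ \cdot \mathbf{n}_{K^+} + \mathbf{v}^- \cdot \mathbf{n}_{K^-}$,

$\underline{[\![\mathbf{v}]\!]} = \mathbf{v}^+ \otimes \mathbf{n}_{K^+} + \mathbf{v}^- \otimes \mathbf{n}_{K^-}$, $[\![\underline{\tau}]\!] = \underline{\tau}^+\mathbf{n}_{K^+} + \underline{\tau}^-\mathbf{n}_{K^-}$.

On boundary edges $e \subset \mathcal{E}_D$, we set $\{\!\{q\}\!\} = q$, $\{\!\{\mathbf{v}\}\!\} = \mathbf{v}$, $\{\!\{\underline{\tau}\}\!\} = \underline{\tau}$, as well as
$[\![q]\!] = q\mathbf{n}$, $[\![\mathbf{v}]\!] = \mathbf{v} \cdot \mathbf{n}$, $\underline{[\![\mathbf{v}]\!]} = \mathbf{v} \otimes \mathbf{n}$, and $[\![\underline{\tau}]\!] = \underline{\tau}\mathbf{n}$.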
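To make the edge operators concrete, the following Python sketch evaluates the average and the three jumps of scalar and vector traces at a single point of an interior edge in two dimensions. It is only an illustration under the assumption that the two outward normals are opposite, $\mathbf{n}_{K^-} = -\mathbf{n}_{K^+}$; the function names are hypothetical and not taken from any DG library.

```python
def average(a_plus, a_minus):
    """Average of two scalar traces, (a+ + a-)/2."""
    return 0.5 * (a_plus + a_minus)


def scalar_jump(q_plus, q_minus, n_plus):
    """Vector-valued jump of a scalar: q+ n+ + q- n-, with n- = -n+."""
    n_minus = tuple(-c for c in n_plus)
    return tuple(q_plus * a + q_minus * b for a, b in zip(n_plus, n_minus))


def normal_jump(v_plus, v_minus, n_plus):
    """Scalar-valued jump of a vector: v+ . n+ + v- . n-."""
    n_minus = tuple(-c for c in n_plus)
    dot = lambda u, w: sum(a * b for a, b in zip(u, w))
    return dot(v_plus, n_plus) + dot(v_minus, n_minus)


def tensor_jump(v_plus, v_minus, n_plus):
    """Matrix-valued jump of a vector: v+ (x) n+ + v- (x) n-."""
    n_minus = tuple(-c for c in n_plus)
    return [
        [v_plus[i] * n_plus[j] + v_minus[i] * n_minus[j] for j in range(2)]
        for i in range(2)
    ]


# Edge with normal n+ = (1, 0): a scalar that drops from 3 to 1 across it.
print(average(3.0, 1.0), scalar_jump(3.0, 1.0, (1.0, 0.0)))
# A continuous vector field has vanishing jumps.
print(normal_jump((1.0, 2.0), (1.0, 2.0), (0.0, 1.0)))
```

With opposite normals, every jump reduces to the difference of the two traces multiplied (in the appropriate sense) by $\mathbf{n}_{K^+}$, which is why continuous functions have zero jumps.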

3.3 Mixed /ip-DGFEM 

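Given a mesh $\mathcal{T}_h$ and a degree vector $\mathbf{k} = \{k_K\}$, $k_K \ge 1$, we wish to
approximate the Stokes problem (1) by finite element functions
$(\mathbf{u}_h,p_h) \in \mathbf{V}_h \times Q_h$, where

$\mathbf{V}_h = \{\mathbf{v} \in L^2(\Omega)^2 : \mathbf{v}|_K \in \mathcal{Q}^{k_K}(K)^2,\ K \in \mathcal{T}_h\}$,

$Q_h = \{q \in L_0^2(\Omega) : q|_K \in \mathcal{Q}^{k_K-1}(K),\ K \in \mathcal{T}_h\}$.

Here, $\mathcal{Q}^k(K)$ is the space of polynomials of degree at most $k$ in each variable
on $K$. Thereby, we consider the following mixed method: find
$(\mathbf{u}_h,p_h) \in \mathbf{V}_h \times Q_h$ such that

$A_h(\mathbf{u}_h,\mathbf{v}) + B_h(\mathbf{v},p_h) = F_h(\mathbf{v})$,
$-B_h(\mathbf{u}_h,q) = G_h(q)$ (2)

for all $(\mathbf{v},q) \in \mathbf{V}_h \times Q_h$. The forms $A_h$ and $B_h$ are discontinuous Galerkin forms
that discretize the Laplacian and the incompressibility constraint, respectively,
with corresponding right-hand sides $F_h$ and $G_h$. These forms are given by

$A_h(\mathbf{u},\mathbf{v}) = \int_\Omega \nabla_h\mathbf{u} : \nabla_h\mathbf{v}\, dx - \int_{\mathcal{E}} \big( \{\!\{\nabla_h\mathbf{v}\}\!\} : \underline{[\![\mathbf{u}]\!]} + \{\!\{\nabla_h\mathbf{u}\}\!\} : \underline{[\![\mathbf{v}]\!]} \big)\, ds + \int_{\mathcal{E}} c\, \underline{[\![\mathbf{u}]\!]} : \underline{[\![\mathbf{v}]\!]}\, ds$,

$B_h(\mathbf{v},q) = -\int_\Omega q\, \nabla_h \cdot \mathbf{v}\, dx + \int_{\mathcal{E}} \{\!\{q\}\!\}[\![\mathbf{v}]\!]\, ds$,

$F_h(\mathbf{v}) = \int_\Omega \mathbf{f} \cdot \mathbf{v}\, dx - \int_{\mathcal{E}_D} (\mathbf{g} \otimes \mathbf{n}) : \nabla_h\mathbf{v}\, ds + \int_{\mathcal{E}_D} c\, \mathbf{g} \cdot \mathbf{v}\, ds$,

$G_h(q) = -\int_{\mathcal{E}_D} q\, \mathbf{g} \cdot \mathbf{n}\, ds$.

Here, $\nabla_h$ and $\nabla_h\cdot$ denote the discrete gradient and divergence operator,
respectively, taken element-wise. The function $c \in L^\infty(\mathcal{E})$ is the so-called
discontinuity stabilization function that is chosen as follows. Define the functions
$\mathtt{h} \in L^\infty(\mathcal{E})$ and $\mathtt{k} \in L^\infty(\mathcal{E})$ by

$\mathtt{h}(x) := \begin{cases} \min\{h_K, h_{K'}\}, & x \in e = \partial K \cap \partial K' \subset \mathcal{E}_I, \\ h_K, & x \in e = \partial K \cap \partial\Omega \subset \mathcal{E}_D, \end{cases}$

$\mathtt{k}(x) := \begin{cases} \max\{k_K, k_{K'}\}, & x \in e = \partial K \cap \partial K' \subset \mathcal{E}_I, \\ k_K, & x \in e = \partial K \cap \partial\Omega \subset \mathcal{E}_D. \end{cases}$

Then we set

$c = \gamma\, \mathtt{h}^{-1} \mathtt{k}^2$,

with a parameter $\gamma > 0$ that is independent of $\mathbf{h}$ and $\mathbf{k}$.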
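The edgewise definition of the stabilization function can be sketched in a few lines of Python. The helper below is hypothetical (its name and argument layout are assumptions made for illustration); it evaluates $c = \gamma\,\mathtt{h}^{-1}\mathtt{k}^2$ on one edge, using the minimum of the two diameters and the maximum of the two degrees on an interior edge and the data of the single adjacent element on a boundary edge.

```python
def penalty(gamma, h_K, k_K, h_neighbor=None, k_neighbor=None):
    """Value of c = gamma * k^2 / h on a single edge.

    If no neighbor data is passed, the edge is treated as a boundary
    edge and only the element K contributes.
    """
    if h_neighbor is None or k_neighbor is None:
        h, k = h_K, k_K
    else:
        h = min(h_K, h_neighbor)  # local mesh size on an interior edge
        k = max(k_K, k_neighbor)  # local degree on an interior edge
    return gamma * k ** 2 / h


# Interior edge between elements with (h, k) = (0.5, 2) and (0.25, 3).
print(penalty(10.0, 0.5, 2, 0.25, 3))
# Boundary edge of the element with (h, k) = (0.5, 2).
print(penalty(10.0, 0.5, 2))
```

The penalty therefore grows when the mesh is refined or the degree is raised, which matches the weights $\mathtt{h}^{-1}\mathtt{k}^2$ used in the broken norm of Section 3.4.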



Remark 1. The form Ah corresponds to the so-called symmetric interior penalty 
(IP) discretization of the Laplace operator; see [1] and [13] where the presen- 
tation and analysis of several different DG methods were unified for diffusion 
problems and the Stokes system, respectively. All the results presented in this 
paper hold true verbatim for all the mixed DG methods investigated in [13]. 

Remark 2. For piecewise analytic data $\mathbf{f}$ and $\mathbf{g}$ as in Theorem 1, the forms $F_h$
and $G_h$ are well-defined.



3.4 Well-posedness 

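Well-posedness of the discrete system (2) was established in [13]. Indeed, by
endowing $\mathbf{V}_h$ with the broken norm

$\|\mathbf{v}\|_h^2 := \|\nabla_h\mathbf{v}\|_{L^2(\Omega)}^2 + \int_{\mathcal{E}} \mathtt{h}^{-1}\mathtt{k}^2 |\underline{[\![\mathbf{v}]\!]}|^2\, ds$,

the forms $A_h$ and $B_h$ are continuous on $\mathbf{V}_h \times \mathbf{V}_h$ and $\mathbf{V}_h \times Q_h$, respectively,
with continuity constants $C > 0$ independent of $\mathbf{h}$ and $\mathbf{k}$. Furthermore, there
exists a parameter $\gamma_{\min} > 0$ independent of $\mathbf{h}$ and $\mathbf{k}$ such that for any
$\gamma \ge \gamma_{\min}$ there exists a coercivity constant $C > 0$ independent of $\mathbf{h}$ and $\mathbf{k}$ with

$A_h(\mathbf{v},\mathbf{v}) \ge C\|\mathbf{v}\|_h^2 \quad \forall \mathbf{v} \in \mathbf{V}_h$.

Finally, the following discrete inf-sup condition holds true:

$\inf_{0 \ne q \in Q_h} \sup_{\mathbf{0} \ne \mathbf{v} \in \mathbf{V}_h} \dfrac{B_h(\mathbf{v},q)}{\|\mathbf{v}\|_h \|q\|_{L^2(\Omega)}} \ge C|\mathbf{k}|^{-1} > 0$,

with a constant $C > 0$ that is independent of $\mathbf{h}$ and $\mathbf{k}$. Here,
$|\mathbf{k}| := \max_{K \in \mathcal{T}_h}\{k_K\}$.

The above properties of the forms $A_h$ and $B_h$, combined with the continuity
of the forms $F_h$ and $G_h$, guarantee the well-posedness of the formulation (2)
for $\gamma \ge \gamma_{\min}$.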

4 Exponential rates of convergence 

In this section, we present the main result of [14], namely exponential rates of
convergence for mixed hp-DGFEM on geometrically refined meshes.

4.1 Geometric meshes 

We first define geometric meshes on $\hat{Q} = (0,1)^2$.

Definition 2. Fix $n \in \mathbb{N}_0$ and $\sigma \in (0,1)$. On $\hat{Q}$, the geometric mesh $\Delta_{n,\sigma}$
with $n + 1$ layers and grading factor $\sigma$ is created recursively as follows: If
$n = 0$, $\Delta_{0,\sigma} = \{\hat{Q}\}$. Given $\Delta_{n,\sigma}$ for $n \ge 0$, $\Delta_{n+1,\sigma}$ is generated by subdividing
the square $K$ with $0 \in \overline{K}$ into four smaller rectangles by dividing its sides in
a $\sigma : (1 - \sigma)$ ratio.

An example of a geometric mesh $\Delta_{n,\sigma}$ on $\hat{Q}$ is shown in Figure 1. We denote
the elements in the basic geometric mesh by $\{K_{ij}\}$, as indicated. We say that
the elements $K_{1j}$, $K_{2j}$ and $K_{3j}$ constitute layer $j$ for $j \ge 2$. The element at
the origin is denoted as $K_{11}$.

Definition 3. A geometric mesh $\mathcal{T}_{n,\sigma}$ on the polygon $\Omega \subset \mathbb{R}^2$ is obtained by
mapping the geometric meshes $\Delta_{n,\sigma}$ on $\hat{Q}$ affinely to a vicinity of each convex
corner of $\Omega$. At reentrant corners three suitably scaled copies of $\Delta_{n,\sigma}$ are used
(as shown in Figure 2). The remainder of $\Omega$ is subdivided with a fixed affine
and quasi-uniform partition.

In Figure 2 this local geometric refinement is illustrated. For ease of
exposition, we only consider mesh patches that are identically refined with the
same parameters $\sigma$ and $n$, although different grading factors and numbers of
layers may be used for the different corner patches.

Fig. 1. The basic geometric mesh $\Delta_{n,\sigma}$ with $n = 3$ and $\sigma = 0.5$

Fig. 2. Local geometric refinement towards the vertices of $\Omega$. In all corners,
$n = 3$ and $\sigma = 0.5$
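The recursive construction of Definition 2 translates directly into code. The following Python sketch builds the basic geometric mesh on the unit square as a list of axis-aligned rectangles; the tuple layout `(x0, y0, x1, y1)` and the function name `geometric_mesh` are assumptions made for this illustration.

```python
def geometric_mesh(n, sigma):
    """Elements of the basic geometric mesh on (0,1)^2, refined towards 0.

    Each element is a tuple (x0, y0, x1, y1). The element touching the
    origin is always stored first.
    """
    elements = [(0.0, 0.0, 1.0, 1.0)]
    for _ in range(n):
        # Split the element at the origin; its sides are divided in the
        # ratio sigma : (1 - sigma).
        _, _, x1, y1 = elements.pop(0)
        xs, ys = sigma * x1, sigma * y1
        elements = [
            (0.0, 0.0, xs, ys),  # new element at the origin
            (xs, 0.0, x1, ys),   # the three elements of the new layer
            (xs, ys, x1, y1),
            (0.0, ys, xs, y1),
        ] + elements
    return elements


mesh = geometric_mesh(3, 0.5)
areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in mesh]
print(len(mesh), mesh[0], sum(areas))
```

For $n = 3$ and $\sigma = 0.5$, the setting of Figure 1, the mesh has $3n + 1 = 10$ elements, and the element at the origin has side length $\sigma^n = 0.125$.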



Definition 4. A polynomial degree distribution $\mathbf{k}$ on a geometric mesh $\mathcal{T}_{n,\sigma}$ is
called linear with slope $\mu > 0$ if the elemental polynomial degrees are layerwise
constant in the geometric patches and given by $k_j := \max(2, \lfloor \mu j \rfloor)$ in layer $j$,
$j = 1, \ldots, n + 1$. In the interior of the domain the elemental polynomial degree
is set to $\max(2, \lfloor \mu(n + 1) \rfloor)$.
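A linear degree distribution in the sense of Definition 4 can be tabulated as follows. This is a minimal Python sketch that assumes the brackets in the definition denote the floor function; the function name `linear_degrees` is illustrative.

```python
import math


def linear_degrees(n, mu):
    """Layerwise degrees k_j = max(2, floor(mu * j)), j = 1, ..., n + 1,
    together with the degree used in the interior of the domain."""
    layers = [max(2, math.floor(mu * j)) for j in range(1, n + 2)]
    interior = max(2, math.floor(mu * (n + 1)))
    return layers, interior


print(linear_degrees(3, 1.0))
```

With slope $\mu = 1$ and $n = 3$, the degrees in layers 1 to 4 are 2, 2, 3, 4: the lowest degrees sit in the small elements next to the corner, and the degree increases linearly away from it.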

4.2 Exponential convergence 

Our main result is the exponential convergence of the mixed hp-DGFEM for
problems with piecewise analytic data. Its detailed proof can be found in [14].

Theorem 2. Assume that the analytical solution $(\mathbf{u},p)$ of the Stokes problem
(1) is piecewise analytic as stated in Theorem 1. Further, let
$(\mathbf{u}_h,p_h) \in \mathbf{V}_h \times Q_h$ denote the DGFEM approximation defined in (2) with
$\gamma \ge \gamma_{\min}$ obtained on geometric meshes $\mathcal{T}_{n,\sigma}$. Then there exists a parameter
$\mu_0 = \mu_0(\sigma,\beta) > 0$, such that for linear degree vectors $\mathbf{k}$ with slope $\mu \ge \mu_0$,
there holds

$\|\mathbf{u}-\mathbf{u}_h\|_h + \|p-p_h\|_{L^2(\Omega)} \le C\exp(-bN^{1/3})$,

with constants $C, b > 0$ independent of $N = \dim(\mathbf{V}_h) \approx \dim(Q_h)$.

Remark 3. If the polynomial degree is chosen to be constant throughout the
mesh, i.e., $k_K = k$ for all $K \in \mathcal{T}_{n,\sigma}$, exponential convergence is still obtained by
choosing $k$ proportional to the number of layers $n$.



5 Numerical experiment 

The goal of this section is to numerically confirm the exponential convergence
result stated in Theorem 2. To this end, let $\Omega$ be the L-shaped domain shown
in Figure 3. As in [17, p. 113], we select the right-hand side $\mathbf{f} = \mathbf{0}$ and take the
Dirichlet boundary datum $\mathbf{g}$ in such a way that, in the polar coordinates $(r,\varphi)$,
the exact solution $(\mathbf{u},p)$ to the Stokes problem (1) is given by

$\mathbf{u}(r,\varphi) = r^\lambda \begin{pmatrix} (1+\lambda)\sin(\varphi)\Psi(\varphi) + \cos(\varphi)\Psi'(\varphi) \\ \sin(\varphi)\Psi'(\varphi) - (1+\lambda)\cos(\varphi)\Psi(\varphi) \end{pmatrix}$,

$p = -r^{\lambda-1}\big[(1+\lambda)^2\Psi'(\varphi) + \Psi'''(\varphi)\big]/(1-\lambda)$,

where

$\Psi(\varphi) = \sin((1+\lambda)\varphi)\cos(\lambda\omega)/(1+\lambda) - \cos((1+\lambda)\varphi)$
$\qquad - \sin((1-\lambda)\varphi)\cos(\lambda\omega)/(1-\lambda) + \cos((1-\lambda)\varphi)$.

Furthermore, $\omega = 3\pi/2$, and the exponent $\lambda$ is the smallest positive solution of
the equation $\sin(\lambda\omega) + \lambda\sin(\omega) = 0$; thereby, $\lambda \approx 0.54448373678246$.
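The quoted value of the singular exponent can be checked numerically. The Python sketch below solves $\sin(\lambda\omega) + \lambda\sin(\omega) = 0$ with $\omega = 3\pi/2$ by plain bisection. The bracket $[0.1, 0.9]$ is an assumption chosen for this illustration: the function is positive just to the right of $\lambda = 0$ and changes sign once inside the bracket, so the root found is the smallest positive one.

```python
import math

OMEGA = 1.5 * math.pi  # opening angle 3*pi/2 of the reentrant corner


def g(lam):
    """Left-hand side of sin(lam * omega) + lam * sin(omega) = 0."""
    return math.sin(lam * OMEGA) + lam * math.sin(OMEGA)


def bisect(f, a, b, tol=1e-15, max_iter=200):
    """Bisection for a root of f in [a, b]; assumes f(a) * f(b) < 0."""
    fa = f(a)
    for _ in range(max_iter):
        m = 0.5 * (a + b)
        fm = f(m)
        if fa * fm <= 0.0:
            b = m
        else:
            a, fa = m, fm
        if b - a < tol:
            break
    return 0.5 * (a + b)


lam = bisect(g, 0.1, 0.9)
print(lam)
```

Since $\lambda < 1$, the velocity behaves like $r^\lambda$ near the origin and its gradient is unbounded there, which is the corner singularity discussed in the text.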

We emphasize that the solution $(\mathbf{u},p)$ is in fact the strongest corner
singularity of the Stokes operator in the domain $\Omega$; it is piecewise analytic, that
is, analytic in $\overline{\Omega} \setminus \{O\}$, but both $\nabla\mathbf{u}$ and $p$ are strongly singular at the
origin. Indeed, here $\mathbf{u} \notin H^2(\Omega)^2$ and $p \notin H^1(\Omega)$; thus, this example reflects the
typical (singular) behavior that solutions of the Stokes problem exhibit in the
vicinity of reentrant corners.





Fig. 3. L-shaped domain $\Omega$

Fig. 4. Performance of the mixed hp-DGFEM



Figure 4 shows the performance of the mixed hp-DGFEM for the above
problem, on meshes that are geometrically refined towards the origin and for
polynomial degree distributions that are linearly increasing away from the
origin. In our computations we used the grading factor $\sigma = 0.5$ and the linear
slope $\mu = 1$. The interior penalty parameter $\gamma$ was chosen as $\gamma = 10$. The
exponential convergence rate, according to Theorem 2, is clearly visible. In
addition, we observe that the asymptotic regime is already achieved even with
a small number of degrees of freedom.

Summary. The aim is to initialize a branch of periodic orbits emanating from
a Hopf bifurcation point. The second-order predictor of the branch is developed.
The problem is discussed in the context of the MATLAB toolbox CL_MATCONT.



1 Introduction 

We consider the dynamical system

$\dot{u}(t) = F(u(t),\lambda)$, (1)

where $F : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$ is a sufficiently smooth mapping; $u \in \mathbb{R}^n$ is a vector
of state variables and $\lambda$ is a real parameter. Let $(u_H,\lambda_H)$ be a Hopf bifurcation
point. Under generic assumptions, see e.g. [9], Theorem 3.3, there is a branch
of periodic orbits (cycles) that emanates from this bifurcation point. One can
decide about the stability of the cycles provided that the first Lyapunov
coefficient $l_1$ is nonzero.

The packages [4], [10] or [1] can detect a Hopf bifurcation point $(u_H,\lambda_H)$
and continue the mentioned branch of cycles.

In order to compute $T$-periodic solutions of (1), one usually rescales time
and looks for 1-periodic solutions to

$-\dot{u}(t) + T F(u(t),\lambda) = 0$, $0 \le t \le 1$, (2)

$u(0) - u(1) = 0$. (3)

Here $T$ is the period of the actual motion and $\omega = 2\pi/T$ is its frequency. The
quoted packages use orthogonal collocation, see [9], to discretize the problem
(2) & (3).

The continuation is initialized by a cycle with a small amplitude $h$, see e.g.
[1], function init_H_LC:

$u^0(t) = u_H + h\,\Re(\exp(2\pi i t)\,\xi_H)$, $0 \le t \le 1$, (4)

where $\xi_H \in \mathbb{C}^n$ and $\omega_H > 0$ satisfy

$F_u(u_H,\lambda_H)\,\xi_H = i\omega_H\xi_H$, $\|\xi_H\| = 1$. (5)

It reflects the well-known fact that the spectrum of the differential $F_u(u_H,\lambda_H)$
contains a purely imaginary eigenpair. The tuple $(u^0(t), T = 2\pi/\omega_H, \lambda = \lambda_H)$ is
considered to be an approximate solution to (2) & (3).

As the predictor for the continuation, it is natural to take

$v^0(t) = \Re(\exp(2\pi i t)\,\xi_H)$, $0 \le t \le 1$, (6)

namely, the differential of the function $u^0$ with respect to $h$. Following this
strategy, the differentials of the components $T$ and $\lambda$ with respect to $h$ are zero.
Therefore, the predictor for the pathfollowing for the first cycle reads as
$(v^0(t), 0, 0)$.
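The first-order initialization (4) and the predictor (6) are simple enough to be written out in a few lines. The Python sketch below evaluates both at a given time $t$; it is not the code of init_H_LC, and the test data (a planar rotation with Jacobian $[[0,-\omega_H],[\omega_H,0]]$, for which $\xi_H = (1,-i)/\sqrt{2}$ satisfies (5)) are hypothetical values chosen only for illustration.

```python
import cmath
import math


def initial_cycle(u_H, xi_H, h, t):
    """u0(t) = u_H + h * Re(exp(2*pi*i*t) * xi_H), cf. (4)."""
    phase = cmath.exp(2j * math.pi * t)
    return [u + h * (phase * xi).real for u, xi in zip(u_H, xi_H)]


def predictor(xi_H, t):
    """v0(t) = Re(exp(2*pi*i*t) * xi_H), cf. (6)."""
    phase = cmath.exp(2j * math.pi * t)
    return [(phase * xi).real for xi in xi_H]


# Hypothetical data: equilibrium at the origin, unit eigenvector of the
# rotation matrix [[0, -w], [w, 0]] for the eigenvalue i*w.
u_H = [0.0, 0.0]
xi_H = [1.0 / math.sqrt(2.0), -1j / math.sqrt(2.0)]

print(initial_cycle(u_H, xi_H, 0.1, 0.0))
print(predictor(xi_H, 0.25))
```

Because $u^0$ depends linearly on $h$, the predictor is exactly the difference quotient of two initial cycles with different amplitudes, and both functions are 1-periodic in $t$ as required by (3).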

The aim of this contribution is to suggest a more sophisticated cycle
initialization than just (4) and (6). The technique is based on the Lyapunov-Schmidt
reduction at the Hopf bifurcation point $(u_H,\lambda_H)$, see e.g. [11], [5]. We consider
the version which is characterized by making use of bordering operators, see [8]
and [13].

In Section 2, we review the technique. In Section 3, we give the 2nd-order
formula to initiate the branch. Finally, in Section 4, we report on a
numerical experiment.



2 Preliminaries 

We review the main points of [8]. Instead of solving (2) & (3) we seek
$2\pi$-periodic solutions of (1); the reasons are just historical. The problem is
formulated as a functional equation $\mathcal{F}(u,\lambda,\omega) = 0$ on proper function spaces:

$\mathcal{F}(u,\lambda,\omega) = -\omega\dot{u} + F(u,\lambda)$, $T = 2\pi/\omega$, (7)

$\mathcal{F} : U \times \mathbb{R} \times \mathbb{R} \to V$, $U = C^1(S^1,\mathbb{R}^n)$, $V = C^0(S^1,\mathbb{R}^n)$, (8)

where $S^1 = \mathbb{R}/2\pi\mathbb{Z}$. The roots of $\mathcal{F}$ are, locally and up to a phase shift, in
one-to-one correspondence with the roots of a scalar bifurcation equation

$\phi(x, \lambda - \lambda_H, \omega - \omega_H) = 0$; (9)

$x \in \mathbb{R}$ is a reduced state variable standing for $u \in U$. The root

$(x = 0, \lambda - \lambda_H = 0, \omega - \omega_H = 0)$

of (9) is related to the root $(u_H,\lambda_H,\omega_H)$ of $\mathcal{F}$. The nontrivial solutions $x \ne 0$
of (9) are linked with cycles, the trivial solutions $x = 0$ are linked with steady
states.

It is natural to develop the periodic solutions u = uh + v E U to Fourier 
series 

+00 

u = UH + [w]o + ^ , 

k=l 



( 10 ) 
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where [v]k : U is the k-th. Fourier coefficient of v eU. 

The reduction procedure depends on a choice of two vectors M G 
and L The requirement is that the bordered matrix 

^Fu{uhAh) - 

is regular. This is generically satisfied. 

We will consider two particular variants: The classical Lyapunov- Schmidt 
reduction, see e.g. [5], which corresponds to the choice 

^ = Ch , L = , (12) 

where is the unit eigenvector, see (5), and t]h is an adjoint eigenvector, 

FJ {uh, ^h)vh = , (13) 

with the scaling — 1- 

As the second variant, we just set 

M = ^H , L = . (14) 

According to [13], Remark 4, the option (14) leads to an alternative formula 
for the first Lyapunov index li. 

Actually, the periodic orbits are linked with solutions (x, A — Aij,a? — coh) 
of the factored bifurcation equation 

x~^(f) (x, X — — ljh) — • (15) 

We consider the truncated Taylor expansion of (15) 

Q<Pxxx ^ + 4^x\ (^ ~ ^u) + 4^x(jj ~ ^h) + h.o.t. = 0 . (16) 

It consists of the low-order terms of the expansion, see [5], p. 86. The
corresponding cycles $u \in U$ are approximated by the accordingly truncated series
(10). In (16), it is understood that the differentials of $\phi$ are evaluated at the
origin, i.e. $\phi_{xxx} = \phi_{xxx}(0,0,0)$, etc. An algorithm for computing the chain
of relevant differentials is supplied in [8], and to a larger extent in [13].

In Section 3, we summarize the asymptotic formula for a cycle (already
given in [8]) and provide a formula for the differential of the cycle with
respect to $x$. The latter formula is the cycle 'velocity' and serves as the predictor
step of the cycle continuation.

We rescale time back to period one, see (3).
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3 The algorithm 



We set h := x, where x is the reduced state, see (9). 

The solution {u{t),T, X) to (2)&(3) could be represented as 

u{t) =uh + 2/i 5ft(exp(27Tit) [vx]i) + 5X [ua]o + 

QKx]o + 5ft(exp(47Tit) [vxx]2)^ + O(h^) , (17) 

T=—, LU = iOH + Sco + 0{h^), (18) 

UJ 

A = Aiy + a + 0(/i^) (19) 



for a small parameter $h$. The increments $\delta\lambda$ and $\delta\omega$ are the solution of a linear
system



\Slu J 6 V ^^xxx ) ’ V / 



(20) 



The objects [vx]i, [^a]o, [vxx]o, [vxxh are vectors in and (t)xxx, 4 >xuj are 
complex constants, see [8], p. 1167. They are computed at a particular Hopf 
bifurcation point uh, Xh, ojh> 

Truncating the higher order terms in (17), (18) and (19) we get the
2nd-order cycle approximation.

Let us differentiate the cycle (i^(t),T, A) with respect to h. The resulting 
differentials [£u{t), £X, £T) can be considered as cycle velocity. For the 
components of cycle velocity we obtain the following asymptotic formulae: 



A^i(^) - 2 5ft(exp(27rit) [ux]i) + 5X [va]o + 

+2/i (^Kxlo + 5ft(exp(47Tit) [uxx] 2 )) + 0(/i^) , 

dA = « + 0(k»). 

where the increments 6X^ 6lj satisfy 

g ^ ^A A ^ ^ 

y SlO J ^ \ ^4^xxx J 



(21) 

( 22 ) 

(23) 



(24) 



Truncating the higher order terms in (21), (22) and (23) we obtain the
1st-order cycle velocity approximation.



Remark 1. Assuming the bordering (12) or (14), it comes out that [vx]i = 
Actually, for a generic bordering, [vx]i is a positive multiple of 
Truncating the 2nd-order terms in the expansions (17), (18) and (19), and the 1st-order terms in the expansions (21), (22) and (23), we get

u^{t) =uh + 2h^{exp{27Tit) [t’a^Ji) , = 2^{exp{2mt) [vx]i) • (25) 

The claim is that choosing L properly, we get (4) and (6). In fact, due to 
Remark 1, the choice e.g. L = 2^^ or L = 2r]*jj would do. 



4 Numerical tests 

We consider 

y = y{ex -jz-p), 
z = z{a-fy - 6) , 

which depends on ten parameters. This dynamical system models corruption 
in democratic societies, see [12]. In [6], p. 317, the system was investigated for 
the parameter setting 

a = 1.5, j3 = 0.5, A: = 1, p = 0.1, fi^ = 1, /i“ = 10, 7 = 1, cr = 2 , 

leaving two parameters $\varepsilon$ and $\delta$ free. A generalized Hopf bifurcation point (GH) was detected as a codimension-2 organizing center, together with a branch of Hopf bifurcation points containing this GH. We consider the particular Hopf point with ordinate $\varepsilon = 0.2$. The coordinates of this point are $u_H$ = [0.7623; 0.3588; 0.0524] and $\delta_H = 0.7176$. Fig. 1 shows a branch of limit cycles emanating from $(u_H, \delta_H)$. There is a turning point (LPC) on the branch; the detection of bifurcation points was switched off. A zoom of the first two cycles is shown in Fig. 2. It illustrates the action of the initialization procedure init_H_LC, see Section 1.

All computations were performed using CL_MATCONT, see [1] and [2]. The toolbox works very well.

In order to compare the action of init_H_LC with the 2nd-order predictor, we coded the algorithm from Section 3 as an m-file called init_H_LC_new. It is meant to replace init_H_LC in the cycle continuation. Fig. 3 shows a sequence of the 2nd-order cycle approximations produced by init_H_LC_new for selected values of $h$. The bordering (12) is considered. Fig. 4 compares the actions of init_H_LC (dotted) with the actions of init_H_LC_new. Note that the dotted cycles do not advance with the parameter $\delta$.

The parameters of both init_H_LC and init_H_LC_new are the already mentioned $h$ and the collocation data setting. The latter will be fixed as nstol = 10, ncol = 4. Let us consider the particular Hopf point $(u_H, \delta_H)$.

Using init_H_LC, one needs to set $h < 0.0003$ to achieve a successful continuation. Choosing $h$ too small, say $h < 0.00001$, the continuation is not initialized either, due to round-off errors. On the other hand, applying init_H_LC_new with bordering (14), the continuation is successful for $h < 0.002$.

Fig. 1. Continuation of the limit cycle via orthogonal collocations

Fig. 2. Zoom - the initial cycle (solid), the next cycle (dotted)
Fig. 3. The 2nd-order approxim[...]

Fig. 4. The 1st-order (dotted) vs [...]
Conclusion: Using the 2nd-order predictor we can afford to take a larger amplitude $h$. That is, we can skip over the initial stages of the continuation.
Summary. The number of real operations necessary to reduce a quaternion-valued matrix into a similar upper Hessenberg matrix by making use of quaternion-valued Givens’ transformation matrices is relatively large. Two possibilities of how to reduce the columns of a quaternion-valued matrix more effectively are presented.



1 Introduction 



In the literature, several applications of quaternions and quaternion-valued matrices can be found. In most of them, the quaternion-valued matrix has to be factored. One conceivable reason is that factorizations yield more time-efficient processing schemes. Another reason is that factorizations can be chosen so as to provide necessary geometric relationships in certain applications.

Let us give an example, which arises in quantum mechanics, see [1], [2]. 
In particular, solving the Secular Equation, we have to find an eigensystem of 
a matrix A. If the nonrelativistic or spin-orbitless problem is considered, the 
matrix A is real and symmetric. The effect of including the spin-orbit coupling 
into the model is to replace each scalar matrix element with a 2 x 2 complex 
matrix of the form 




As a consequence, the matrix A becomes complex, in general nonhermitean 
and its size is doubled, i.e. the smallest eigenvalue (~ 10~^) occurs twice. In 
numerical calculations, computational noise is dramatically increased. 

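Let $\mathbb{M}$ be the space of such complex $2 \times 2$ matrices:

$$\mathbb{M} := \left\{ h = \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} : \ \alpha = a_1 + i a_2,\ \beta = a_3 + i a_4,\ a_1, a_2, a_3, a_4 \in \mathbb{R} \right\}$$

and let $\mathbb{H}$ be the space of quaternions:

$$\mathbb{H} := \{ h = (a_1, a_2, a_3, a_4) : a_1, a_2, a_3, a_4 \in \mathbb{R} \}.$$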
Since $\mathbb{M}$ and $\mathbb{H}$ are isomorphic, see [4], we can replace $2 \times 2$ complex matrices with corresponding quaternions and use quaternion arithmetic in calculations. The arithmetic of quaternions is, of course, more complicated, but it allows us to economize the storage and to increase the accuracy of results, see [1].

In this paper, our aim is to develop an algorithm for the reduction of a given quaternion-valued matrix into its upper Hessenberg form. First, we briefly review the algebra of quaternions, see [6]. We also briefly mention the algorithm of Givens’ reduction of a quaternion-valued vector $x \in \mathbb{H}^2$. It can be found in [3], where all necessary facts are also proved in detail. Then, Givens’ transformation matrices for a vector $x \in \mathbb{H}^n$ are defined. We discuss some possibilities of reducing the number of necessary real operations. Finally, the number of real operations for the reduction of a quaternion-valued matrix into a similar matrix in upper Hessenberg form is calculated. Our aim is to show in particular that it is necessary to develop an algorithm which reduces the large number of operations. A numerical example of Givens’ reduction of a quaternion-valued matrix into a similar upper Hessenberg form is presented.



2 Short review of the algebra of quaternions 

Let $\mathbb{H} = \mathbb{R}^4$ be equipped with the ordinary vector space structure and with an additional multiplicative operation $\mathbb{H} \times \mathbb{H} \to \mathbb{H}$, which can most easily be defined by a multiplication of the four basis elements

$$(1,0,0,0) = 1, \quad (0,1,0,0) = i, \quad (0,0,1,0) = j, \quad (0,0,0,1) = k:$$

$$i^2 = j^2 = k^2 = ijk = -1. \qquad (1)$$

An element $x = (x_1, x_2, x_3, x_4) \in \mathbb{H}$ has the representation

$$x = x_1 1 + x_2 i + x_3 j + x_4 k, \qquad (2)$$

where $x_1, x_2, x_3, x_4 \in \mathbb{R}$, and $\Re x := x_1$ is the real part of $x$. We will identify the quaternion $x = (x_1, 0, 0, 0)$ with the real number $x_1$; the quaternion $x = (x_1, x_2, 0, 0)$ will be identified with the complex number $x_1 + i x_2$. For $x = (x_1, x_2, x_3, x_4) \in \mathbb{H}$, $y = (y_1, y_2, y_3, y_4) \in \mathbb{H}$ it follows from (1) that

$$xy = (x_1y_1 - x_2y_2 - x_3y_3 - x_4y_4)\,1 + (x_1y_2 + x_2y_1 + x_3y_4 - x_4y_3)\,i + (x_1y_3 - x_2y_4 + x_3y_1 + x_4y_2)\,j + (x_1y_4 + x_2y_3 - x_3y_2 + x_4y_1)\,k.$$

Obviously, in general, the multiplication is not commutative. 
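The product rule and the correspondence with complex $2 \times 2$ matrices can be checked numerically. The following Python sketch is only an illustration: the helper names `qmul` and `to_complex_matrix` are hypothetical, and the matrix form with rows $(\alpha, \beta)$ and $(-\bar\beta, \bar\alpha)$, $\alpha = x_1 + i x_2$, $\beta = x_3 + i x_4$, is assumed. It confirms that $ij = k$ while $ji = -k$, and that the matrix of a product equals the product of the matrices for one sample pair.

```python
import numpy as np

def qmul(x, y):
    """Product of two quaternions given as 4-tuples (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 - x2 * y2 - x3 * y3 - x4 * y4,   # real part
        x1 * y2 + x2 * y1 + x3 * y4 - x4 * y3,   # i component
        x1 * y3 - x2 * y4 + x3 * y1 + x4 * y2,   # j component
        x1 * y4 + x2 * y3 - x3 * y2 + x4 * y1,   # k component
    )

def to_complex_matrix(x):
    """Assumed 2x2 complex representation [[alpha, beta], [-conj(beta), conj(alpha)]]."""
    alpha = complex(x[0], x[1])
    beta = complex(x[2], x[3])
    return np.array([[alpha, beta],
                     [-beta.conjugate(), alpha.conjugate()]])

I, J = (0, 1, 0, 0), (0, 0, 1, 0)
ij = qmul(I, J)   # equals k
ji = qmul(J, I)   # equals -k, so the product is not commutative

# Sample pair (arbitrary illustrative values)
x = (1.0, 2.0, -3.0, 0.5)
y = (-2.0, 0.25, 4.0, 1.0)
same = np.allclose(to_complex_matrix(x) @ to_complex_matrix(y),
                   to_complex_matrix(qmul(x, y)))
```

Storing a quaternion as four reals instead of a $2 \times 2$ complex matrix (eight reals) is the storage saving mentioned in the Introduction.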

Given $x$ according to (2), the conjugate $\bar x$ of $x$ is defined to be

$$\bar x = (x_1, -x_2, -x_3, -x_4). \qquad (3)$$

Let us note that conjugation obeys the rules

$$\overline{xy} = \bar y\, \bar x, \qquad \bar{\bar x} = x.$$
We define the absolute value of $x$ by

$$|x| = \sqrt{x_1^2 + x_2^2 + x_3^2 + x_4^2}. \qquad (4)$$

We will use the following properties of $|x|$:

$$|x|^2 = x \bar x = \bar x x, \qquad |xy| = |yx| = |x||y|. \qquad (5)$$

Here, we use the convention that a real number $x$ may be identified with the quaternion $(x, 0, 0, 0)$.

The space $\mathbb{H}$ is a normed vector space over $\mathbb{R}$, where the norm is introduced in (4).

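Let us remark that for any $x \in \mathbb{H}\setminus\{0\}$ an inverse quaternion $x^{-1}$ is defined,

$$x^{-1} = \frac{\bar x}{|x|^2}. \qquad (6)$$

Let $x = (x_1, x_2, x_3, x_4)$, $y = (y_1, y_2, y_3, y_4) \in \mathbb{H}$ be two quaternions. Then

$$\Re(xy) = x_1y_1 - x_2y_2 - x_3y_3 - x_4y_4 = \Re(yx), \qquad (7)$$

an equation which will be used later.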

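Two quaternions $x$ and $y$ are called equivalent, denoted by $x \sim y$, if $y = a^{-1} x a$ for some $a \in \mathbb{H}\setminus\{0\}$. For fixed $x \in \mathbb{H}$ the set

$$[x] = \{ y \in \mathbb{H} : y = a^{-1} x a \ \text{for } a \in \mathbb{H}\setminus\{0\} \} \qquad (8)$$

is called the equivalence class of $x$.

Lemma 1. Two quaternions $x$ and $y$ are equivalent if and only if

$$\Re x = \Re y, \qquad |x| = |y|. \qquad (9)$$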

Corollary 1. Let $x = (x_1, x_2, x_3, x_4)$. Then

$$y = \left(x_1, \sqrt{x_2^2 + x_3^2 + x_4^2}, 0, 0\right) \in [x]$$

is the only complex element in $[x]$ with nonnegative imaginary part.
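Lemma 1 and Corollary 1 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names (`qmul`, `qconj`, `qinv`, `qabs`, `complex_representative`) are hypothetical, the inverse is taken as $\bar a / |a|^2$, and the sample quaternions are arbitrary. It forms $y = a^{-1} x a$ and exposes the real parts and absolute values of $x$ and $y$ for comparison, together with the complex representative of $[x]$.

```python
import math

def qmul(x, y):
    """Product of two quaternions given as 4-tuples."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 - x2 * y2 - x3 * y3 - x4 * y4,
        x1 * y2 + x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x2 * y4 + x3 * y1 + x4 * y2,
        x1 * y4 + x2 * y3 - x3 * y2 + x4 * y1,
    )

def qconj(x):
    """Conjugate: flip the sign of the three imaginary components."""
    return (x[0], -x[1], -x[2], -x[3])

def qabs(x):
    """Absolute value |x|."""
    return math.sqrt(sum(c * c for c in x))

def qinv(x):
    """Inverse of a nonzero quaternion, conj(x) / |x|^2."""
    n2 = sum(c * c for c in x)
    return tuple(c / n2 for c in qconj(x))

def complex_representative(x):
    """The complex element of [x] with nonnegative imaginary part (Corollary 1)."""
    return (x[0], math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2), 0.0, 0.0)

x = (1.0, 2.0, -2.0, 1.0)          # arbitrary sample quaternion
a = (0.5, -1.0, 3.0, 2.0)          # arbitrary nonzero quaternion
y = qmul(qmul(qinv(a), x), a)      # y is equivalent to x by definition

rep = complex_representative(x)    # real part 1, imaginary part sqrt(4 + 4 + 1) = 3
```

By Lemma 1, `y[0]` agrees with `x[0]` and `qabs(y)` with `qabs(x)` up to rounding, and `rep` has the same real part and absolute value as `x`.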



Let $x = (x_1, x_2, \dots, x_n)^T \in \mathbb{H}^n$ and define the norm $\|x\|$ of $x$ by

$$\|x\| = \sqrt{\sum_{j=1}^{n} |x_j|^2}. \qquad (10)$$

The space $\mathbb{H}^n$ becomes a normed vector space over $\mathbb{R}$ with the norm defined in (10). For $x \in \mathbb{H}^n$ we denote by $x^*$ the transpose of the entrywise conjugate of $x$.
We define the conjugate transposition of the matrix $B = (b_{ij}) \in \mathbb{H}^{m \times n}$ as $B^* = (\bar b_{ji}) \in \mathbb{H}^{n \times m}$. The square quaternion-valued matrix $B \in \mathbb{H}^{n \times n}$ is called Hermitean if $B = B^*$, and $B$ is positive definite if it is Hermitean and

$$x^* B x > 0 \quad \forall x \in \mathbb{H}^n \setminus \{0\}.$$

A matrix $B \in \mathbb{H}^{n \times n}$ is said to be unitary if $B^* B = I$.

Theorem 1. A matrix $B \in \mathbb{H}^{n \times n}$ is unitary if and only if $\|Bx\| = \|x\|$ for all $x \in \mathbb{H}^n$.

Definition 1. Let $B \in \mathbb{H}^{n \times n}$. If there exist a vector $x \in \mathbb{H}^n \setminus \{0\}$ and a quaternion $\lambda \in \mathbb{H}$ such that

$$Bx = x\lambda, \qquad (11)$$

we call $\lambda$ an eigenvalue of $B$ and $x$ an eigenvector corresponding to $\lambda$.

The number of eigenvalues of a quaternion-valued matrix $B \in \mathbb{H}^{n \times n}$ is, in general, not finite. If $\lambda$ is an eigenvalue, one can easily show that the whole equivalence class $[\lambda]$ consists of eigenvalues. If $\lambda$ is real, then $[\lambda] = \{\lambda\}$; if $\lambda$ is not real, then according to Corollary 1 there exists exactly one complex number in $[\lambda]$ with positive imaginary part. It can also be proved that the number of non-equivalent eigenvalues is at most $n$.

Theorem 2. Let $A \in \mathbb{H}^{n \times n}$ be Hermitean. Then $A$ has only real eigenvalues and their number is $n$.

Theorem 3. For any unitary quaternion-valued matrix $A$, the eigenvalues $\lambda$ satisfy

$$|\lambda| = 1.$$



3 Givens’ transformation of a vector $x \in \mathbb{H}^2$

Let $x \in \mathbb{H}^2 \setminus \{0\}$ be given. Let $G$ be a matrix of the form

G = (_^ c ) ^ • 

Our aim is to find $c$ and $s$ such that

(i) $G$ is unitary,

(ii) $G^* x = u e_1$, where $u \in \mathbb{H} \setminus \{0\}$, $e_1 = (1, 0)^T$.

The solution is contained in the following theorem.

Theorem 4. Let $x = (x_1, x_2)^T \in \mathbb{H}^2 \setminus \{0\}$ be given. Define



( 12 ) 




514 D. Janovska, G. Opfer 



X2 I I 1 

S (7 C <7 . (J 1, 

l|x|| ||x|| 

where a is arbitrary in case x\ = a X 2 ^ot ^ (R)\{0}. Otherwise we have to 
choose a E E, where 

^ = = H + l/>l>o}. 

Then G is a unitary matrix, 

G*x = u = cr||x||(l, 0)"^ 

and there are no other unitary quaternion- valued matrices satisfying this con- 
dition. 



Proof. See [3]. 

Remark 1. The set E is not empty, since it contains the subset {±sgnTT, ± 
sgnx 2 }. But it is different from the whole unit sphere, since ±1 do not, in 
general, belong to E. Let us note that s is real for the choice a — ±sgn^ and 
c is real if a = ±sgn^. 



4 Givens’ reduction of a vector $x \in \mathbb{H}^n$

Let $x = (x_1, \dots, x_n)^T \in \mathbb{H}^n \setminus \{0\}$. For $1 \le i < j \le n$, we define the Givens’



rotation matrix G^- G 

/i. 



G* = 

J 



as follows: 



\0 



0\ 



• 1 / 



(13) 



where 



= 



(7 iX q 



Xi\^ + \Xj\^ 



CrXi 



+ byP 



(14) 



(Ji G H, \ai\ = 1 , Gi is arbitrary if Xi — axj, a G R\{0}. Otherwise, Gi G Ei, 



Ei = {g e M : G ■■ 



axi /3 Xj 

\axi + f3xj\ 



, a,/?GR, |a| + b| >0}. (15) 
Then $G_j^i$ is a unitary matrix and



(Gj; ) X , . . . , , Ui^ 5 • • • 5 ^j—1 1 •)'•••) ^n) 5 



Ui = <yisj\xi\^ + \xj\'^ . 



Let us suppose that our aim is to reduce all components of a given quaternion-valued vector except the first one. We will discuss two possibilities for the ordering of such a reduction.

Let us reduce the components in the order from bottom to top. In order to decrease the number of operations, all matrices necessary for the reduction can be multiplied to obtain just one matrix of the reduction.

Theorem 5. For a given vector x = (xi, . . . , Xn)^ G Xn / 0, let 

G = GrlG”:^••G2Gl. (16) 

Then, there exists cr G M, |cr| = 1 such that 

G*x = (cr||x||,0, . . . ,0)^ G . 

The unitary matrix 

G* = (Gi)*(Gi)*---(G::: 2 )*(Gri)* 



has upper Hessenberg form. 



Proof. All matrices GJ- in (16) are unitary, so is their product. 

fn 



Let us set Qi = 




’■|2 



1 ,.. 



n — 1. Since Xn ^ 0, also Qi 0 for 



all i. 

Let us start from the first reduction. We construct reduce the last 

component of x G W^: 



1 T n 

(G(J“ )* {xi,X2,. ■ . ,Xn-2,Xn-l,Xn) = {xi,X2, . ■ . ,Xn-2,Un-l,0) 



where 



^n—l — 



^n— l^n— 1 “1“ Pn—l^n 



\oin—l^n—l “t~ Pn—l^n\ 
In the next step we continue: 



^n—l’)Pn—l G M, T l| ^ 6. 



(G^_l)* (xi, X2, . . . , Xn-2, 0) = (xi, X2, • • . , Un-2, 0, 0) , 



where 



^n—2 — ^n— 2 \/l^n— 2 P 4" |'^n— Ip — ^n—2Qn—2 7 
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^n—2 



-2^n — 2 “k (^ri—2'^n—l — 2^n— 2 “k f^n—2^n—lQn—l 



\^n—2^n — 2 H“ /^n— 2'^n— ij |<^n— 2^n — 2 ~t~ /3n—2^n—lQn—l\ 

^n—2'i(^n—2 ^ ^7 |<^n— 2I “k 1/^n— 2I ^ 

Now, we form the product (Gn-i)*(^n~^)* continue by reducing iAn- 2 - 
By induction we obtain the following explicit formulae for the matrix G* = 



9id = 

9kj 

9n,n—l — 



a\Xj 






^^+1 • -| D. 

Qi 



_ Xk-lCTk-lOkXj 

— , /i. — Z, . . . , 1 , J 



^n^n — 1 
^n— 1 



9n,n — 



^n— l^n— 1 
Pn— 1 

0, 



where 



^n—l 



^n—1 ^n—1 d" /^n— 1 

l^n— 1 ^n— 1 d" f^n—1 

ai Xi + 



z = 3, . . . ,n, j = 1, . . . ,z - 2; 



/^n-l ^ ^ 5 |<^n-l| d- |/5n-l| > 0, 



, ai,/5i€R, \ai\-\-\l3i\ > 0 for all z = n-2, . . . , 1 , 



\oLi Xi -j- ^iGiJ^\QiJ^\ 

i.e., the matrix G* has upper Hessenberg form and 
G*x-(or||x|l,0,...,0)'^GE 



Remark 2. For a given quaternion-valued vector $x \in \mathbb{H}^n$ we can construct a Householder transformation matrix $H$, which also reduces all components of $x$ except the first one: $Hx = u e_1$, $u \in \mathbb{H}$, $e_1 = (1, 0, \dots, 0)^T$, but this matrix $H$ is full, in general.

For a given x = (xi, . . . , Xn)"^ G x^ ^ 0, let us choose cr^_i = sgnx^ at 
each step of the reduction. Such ai always belongs to the set Ui^ see Remark 
1. If we set a = (7n-i we obtain 



(Ji = a for z = 1 , . . . , n — 1 



and G*x= (cr||x||, 0, . . . ,0)"^ = (sgna:„ ||x||, 0, . . . , 0)"^ 



where 
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TXl 


crx2 


(JX3 


ax4 


axs 


(TXn 


|x|| 


||x|| 


l|x|| 


INI 


INI 


INI 














Q2 


XIX2 


XIX3 


XIX4 


XIX3 


XiXn 


Qi 


Q1Q2 


Q1Q2 


^1^2 


Q1Q2 


Q1Q2 


0 




X2^ 




X2~^ 


X2~^ 


Q2 


Q2Q3 


Q2Q3 


Q2Q3 


Q2Q3 



0 



V 



0 



Qn—1 ^n—2^n—l ^n—2^n 

Qn—2 Qn—2Qn—l Qn—2Qn—l 

Q XriCT Xyi—\(J 

Qn—1 Qn—1 / 



The elements below the main diagonal are real. 

If we reduce x in the order from top to bottom, a theorem similar to 
Theorem 5 can be proved. In particular, 

G*x = (Gi)*(Gi_i)* ■ ■ • (Gi)*(Gi)*x = (ct||x||,0, . . . , 0)^, 

the resulting G* is not an upper Hessenberg matrix, but it has “nearly trian- 
gular form” : 

/ * * * * . . . * \ 

* * 0 0 0 

* * * 0 0 

: 0 
\ * * * . . . * / 

the diagonal elements except the first one are real. 



5 A reduction of an arbitrary quaternion-valued matrix 
into upper Hessenberg form 

In order to reduce an arbitrary matrix $Y = (y_{ij}) \in \mathbb{H}^{n \times n}$ to upper Hessenberg form there are $(n-2)(n-1)/2$ elements to be reduced. They are usually reduced step by step, for example in the order

$$y_{31}, y_{41}, \dots, y_{n1};\ y_{42}, y_{52}, \dots, y_{n2};\ \dots;\ y_{n,n-2}.$$

The corresponding Givens transformations $G_j^i$, see (13), are then applied to the $k$-th column of $Y$, $k = 1, \dots, n-2$:

$$x := (y_{1k}, \dots, y_{ik}, \dots, y_{jk}, \dots, y_{nk})^T, \qquad i = k+1; \quad j = k+2, \dots, n.$$
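The ordering of the eliminations can be written down explicitly. The short Python sketch below (the function name `reduction_order` is hypothetical) lists the 1-based index pairs $(j, k)$ of the entries $y_{jk}$ to be annihilated, column by column, and allows the count $(n-2)(n-1)/2$ to be checked for a given $n$.

```python
def reduction_order(n):
    """Index pairs (j, k), 1-based, of the entries y_jk below the first
    subdiagonal, listed column by column: k = 1..n-2, j = k+2..n."""
    return [(j, k) for k in range(1, n - 1) for j in range(k + 2, n + 1)]

order = reduction_order(5)
# For n = 5 this gives y31, y41, y51; y42, y52; y53
count = len(order)   # (5 - 2) * (5 - 1) / 2 = 6
```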
Let $1 \le i_0 < j_0 \le n$ be fixed. Let the $k$-th column of the matrix $Y$ play the role of $x$ in (14), (15).

The corresponding Givens’ matrix for the reduction of the element yj^^k 
has the form 



GZ = i9ij) e 



jrnxn I 9 ioio Qiojo 
9 joio 9 jo jo 



9 ij 



Sij otherwise . 



Here (for a properly chosen cr^o), 
^ioVjok 



^ioViok 






Let us count the number of real operations necessary for the reduction of an arbitrary matrix $Y \in \mathbb{H}^{n \times n}$ to upper Hessenberg form. Moreover, let our resulting upper Hessenberg matrix be similar to the original one, so that we can use the reduced form to compute the eigenvalues of the matrix $Y$. Let us remark that 1 addition of two quaternions needs 4 real flops and 1 multiplication of two quaternions needs 28 real flops. For the reduction of one element of $Y$ we have to perform two matrix multiplications:

$V = (G_{j_0}^{i_0})^* Y$: $4n$ multiplications and $2n$ additions of quaternions,

$V G_{j_0}^{i_0}$: $4n$ multiplications and $2n$ additions of quaternions.

Altogether, the reduction of one element of $Y$ needs $240\,n$ real flops. We can substantially reduce the number of operations (to $144\,n$ real flops) if $s$ or $c$ is real.
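The two totals can be reproduced by simple bookkeeping. The Python sketch below is an illustration, not the authors' code: the names are hypothetical, and it rests on two assumptions that are not stated in the text, namely that a product of a real number with a quaternion costs 4 real flops, and that when $c$ (or $s$) is real exactly half of the $4n$ quaternion multiplications in each matrix product involve that real coefficient.

```python
ADD = 4        # real flops per quaternion addition (from the text)
MUL = 28       # real flops per quaternion multiplication (from the text)
REAL_MUL = 4   # assumed cost of multiplying a quaternion by a real number

def flops_per_element(n, real_coefficient=False):
    """Real flops needed to annihilate one entry of an n x n matrix
    using two matrix products (left and right Givens multiplication)."""
    if real_coefficient:
        # assumption: 2n of the 4n multiplications use the real c (or s)
        mult_cost = 2 * n * MUL + 2 * n * REAL_MUL
    else:
        mult_cost = 4 * n * MUL
    add_cost = 2 * n * ADD
    return 2 * (mult_cost + add_cost)

def flops_total(n, real_coefficient=False):
    """Flops for all (n-2)(n-1)/2 eliminations, at the same cost each."""
    eliminations = (n - 2) * (n - 1) // 2
    return eliminations * flops_per_element(n, real_coefficient)

general = flops_per_element(1)          # 2 * (112 + 8)   = 240 per unit n
cheaper = flops_per_element(1, True)    # 2 * (56 + 8 + 8) = 144 per unit n
```

Under these assumptions the cost of the whole reduction grows like $120\,n^3$ in the general case, which is the motivation for the cheaper variants discussed in Section 4.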

To obtain the upper Hessenberg form we have to reduce $(n-2) + (n-3) + \dots + 2 + 1 = \frac{(n-2)(n-1)}{2}$ elements. Let $\Gamma$ be the product of the corresponding Givens’ matrices:



i^2 i^n—2 

— ‘ ‘ ■ ’ ‘ ’ W W 

We denote the resulting upper Hessenberg matrix by $W$. After all reductions we obtain

$$W = \Gamma^* Y \Gamma, \qquad W \sim Y, \qquad W \text{ is an upper Hessenberg matrix.}$$

Example 1. The elements of the following $5 \times 5$ quaternion-valued matrix $Y$ are chosen randomly (integer elements in $(-5, 5)$). Let Y =



/ y\i 


V12 


y\3 


V 14 


yi^ 


J /21 


3/22 


3/23 


3/24 


3/25 


3/31 


3/32 


3/33 


3/34 


3/35 


3/41 


3/42 


3/43 


3/44 


3/45 


3/51 


3/52 


V 53 


3/54 


3/55 \ 


5 


4 


3 


-3 


-2 


0 


0 


1 


4 


-5 


3 


3 


-4 


-2 


2 


4 


1 


-1 


3 


3 


-5 


-2 


5 


-2 


-5 


0 


2 


3 


4 


0 


0 


0 


-4 


-1 


0 


-3 


-3 


1 


0 


-4 


-5 


4 


-1 


4 


-1 


-1 


-2 


-2 


-4 


0 


-4 


5 


4 


2 


-4 


-3 


3 


-4 


2 


4 


2 


-3 


-3 


3 


3 


-4 


0 


-1 


3 


-1 


3 


0 


3 


-2 


1 


V -4 


3 


-4 


-1 


0 


4 


2 


1 


-4 


3 


-3 


0 


4 


-2 


0 


2 


0 


4 


4 


-3 


-2 


-3 


4 


2 


2 / 
Then $W = \Gamma^* Y \Gamma$ =



/ -^13 -^14 -^15 -^21 '^22 '^23 '^24 '^25 



5.0000 1.3205 - 2.9935 1.7707 2.3056 

0 2.7692 1.6199 - 5.2047 1.0482 

- 4.0000 6.5897 - 0.8168 0.7781 3.7071 

V - 4.0000 1.5513 5.8488 0.7602 0.8553 



10.0000 0.0833 - 0.8560 - 3.3842 - 0.4310 

2.0000 3.5534 4.3072 4.6697 1.8606 

- 6.0000 1.9352 - 3.9121 - 1.1416 - 3.9023 

4.0000 - 0.1335 4.6311 - 0.2449 0.6140 



-^31 


‘^32 


^33 "^34 '<" 35 '^ 41'^42 


■^43 


u;44 


^45 


ix;51 ^52 1x^53 


'“^54 


-^ 55 \ 


0 


- 4.6098 


- 4 . 4234 - 0 . 9866 - 1.7903 


0 


0 


- 2 . 4908 - 


2.4512 


1.5053 


0 


0 


0 - 


2.0766 


0.7912 


0 


- 3.7619 


4 . 5482 - 2 . 9750 - 2.0253 


0 


0 


- 2.2011 


2 . 3583 - 2.0050 


0 


0 


0 


2.9598 


0.6768 


0 


- 3.4546 


- 2.0860 0.5185 3.4334 


0 


0 


0.7713 


3 . 6406 - 2.9308 


0 


0 


0 - 


4.8980 


4.6884 


0 


- 6.2755 


1 . 5631 - 0 . 8724 - 2.2988 


0 


0 


6.8124 


1 . 1480 - 1.3262 


0 


0 


0 - 


5 . 2663 - 


- 0 . 0303 / 



Thus, the structure of W is

$$\begin{pmatrix} * & * & * & * & * \\ * & * & * & * & * \\ 0 & * & * & * & * \\ 0 & 0 & * & * & * \\ 0 & 0 & 0 & * & * \end{pmatrix},$$

just an upper Hessenberg matrix.



6 Conclusions 

We have developed an algorithm for reducing a quaternion-valued matrix into a similar upper Hessenberg matrix by making use of quaternion-valued Givens’ transformation matrices. The algorithm was written and tested in MATLAB. It is a generalization of the algorithm given in [3]. The number of real operations necessary to perform this reduction is still too large. In Section 4, we suggested how to reduce this number for each column.

We are convinced that a good way to reduce the number of real operations is to try to apply the Fast Givens’ algorithm, which is used in the real case.

A complex version of the Fast Givens’ algorithm can be found in [5] (written in Chinese). This is the only paper dealing with Fast Givens’ transformations applied to complex vectors that we were able to find.
Summary. This contribution introduces a model of compressible flow, transport of mass and energy, and production of energy in a time-dependent domain, which is being developed at the Technical University of Liberec. The main focus is on the finite volume model of advection-diffusion mass transport. The explicit upwind scheme is used; therefore the computational time step is restricted by the stability condition and the phenomenon of numerical diffusion may be significant. A scheme for the reduction of numerical diffusion is proposed and numerical tests are presented.



1 Introduction 

The final aim of our modelling is to predict the production of nitrogen oxides from an internal combustion engine. For this purpose, a precise chemical reaction model should be developed. Its main inputs would be the flow, temperature, and pressure fields and their development in time. This contribution is devoted to a model of the relevant physical processes, which is being developed at the Technical University of Liberec and whose outputs would form such inputs to the precise chemical reaction model. It is being built as a model of compressible flow, transport of mass and energy, and production of energy in a time-dependent domain.

A short overview of the model is given in the next section. The rest of the paper discusses in more detail the finite volume mass transport model, a method for the reduction of numerical diffusion, and its testing.



2 Overview of the model 

The volume and shape of the engine cylinder change in time. To simplify the problem, we discretize it in time and in each time step we split the solution into two stages, an isochoric and an adiabatic one. All modelled processes are computed in the isochoric stage, which is supposed to take place in a fixed domain:

* This work was supported with the subvention from Ministry of Education 
of the Czech Republic, project code 242200001. 
- production of mass and energy by chemical reactions,

- compressible flow of gas mixture,

- mass and energy transport.

All of them are discretized in time and space and solved by either the finite volume or the finite element method. The spatial computational mesh, built up of trilateral prismatic elements/volumes in layers, is common to all models (see Fig. 1).

The adiabatic stage models an immediate change of volume. Its key procedure is the change of the computational mesh. A more detailed description of the setting of the model can be found in [5].




Fig. 1. A schematic example of computational mesh in four time steps 



2.1 Model of chemical processes 

All chemical processes are described by a set of stoichiometric equations, each of which can be written as

$$a_1 A_1 + a_2 A_2 + \dots + a_{n_R} A_{n_R} \rightarrow b_1 B_1 + b_2 B_2 + \dots + b_{n_P} B_{n_P} + q,$$

where $a_i$ and $b_j$ are the stoichiometric coefficients of the reagents $A_i$ and the products $B_j$, respectively. During this reaction, the heat $q$ is produced. It can be positive or negative depending on the type of the reaction. Local kinetic equations for the computation of mass and energy production can be expressed by linear ordinary differential equations



dcj 

dt 



= —Rmiai, 



dt Cv' 



where $m_i$ is the molar mass of reagent $A_i$ and $c_v$ is the heat capacity of the gas mixture. The simplifying assumption that the reaction rate $R$ depends only on the mass fraction of a chosen gas component was applied; its application and the results of its calibration are presented in [4]. One calibration result is illustrated in Fig. 2.
Fig. 2. Results of calibration of simplified reaction model. Bold lines are computational results, thin lines are measured data.



2.2 Fluid flow model 

The flow of the compressible gas mixture is governed by the set of Navier-Stokes equations, the continuity equation, and the state equation of a perfect gas:

$$\frac{\partial v}{\partial t} + (v \cdot \nabla) v = \nu \Delta v + \tilde\nu \nabla(\nabla \cdot v) - \frac{1}{\varrho} \nabla p,$$

$$\frac{\partial \varrho}{\partial t} + \nabla \cdot (\varrho v) = 0,$$

$$p = R T \varrho.$$

Here, $v$ is the velocity vector, $p$ the pressure of the gas, $\varrho$ its density, $\nu$ and $\tilde\nu$ are viscosity coefficients, $R$ is the molar gas constant, and $T$ is the temperature.

The nonlinear system is discretized by the mixed hybrid finite element method and linearized. The formulation of the model is given in [5]; global behaviour tests were successfully performed and further testing is in progress.

2.3 Model of mass and energy transport 

Let $\Omega \subset \mathbb{R}^3$ represent the domain of the interior of the engine cylinder (since all processes are solved as isochoric processes in the fixed domain, we also omit the time dependence of $\Omega$ in our notation). The boundary of $\Omega$ is divided into two disjoint parts $\Gamma_{in}$ and $\Gamma_{ex}$. The part $\Gamma_{in}$ represents the inlet part of the boundary, the remaining part is denoted by $\Gamma_{ex}$. The problem is solved in the time interval $(0, \bar t)$.

The mass transport is governed by the set of mass balance equations for each component of the gas mixture [1]:

$$\frac{\partial (\varrho c_i)}{\partial t} + \nabla \cdot (f c_i + j_i) + c_i \gamma_- = c_{i,*} \gamma_+ \quad \text{in } \Omega, \quad i = 1, \dots, N \qquad (1)$$

in a given flow field $f(x, t)$ for the unknown functions $\varrho(x, t)$ and $c_i(x, t)$ ($i = 1, \dots, N$). $N$ is the number of components of the gas mixture, $\gamma_+$ denotes the density of sources (with defined mass fractions $c_{i,*}$) and $\gamma_-$ is the (positive) density of sinks. The diffusion flux $j_i$ is given using the effective diffusivity $\mathcal{D}_i$ by the Fick's-law-like relation $j_i = -\varrho \mathcal{D}_i \nabla c_i$, slightly corrected as proposed by Sutton and Gnoffo in [6].
The energy transport is derived from the balance of the internal energy, which can be written with respect to the temperature as [1]



Cv 



' djgT) 

dt 



+ V • (fT) 



-V • (/cVT) 4- TCv7-=T^Cv7++ in O. 



Here, T* is the temperature of inflowing fluid. The terms and express 
reversible and irreversible rates of conversion between mechanical and internal 
energy and can be written as 



dxi dxj 3 ^ dxi ’ 



where fi is viscosity coefficient. In these formulas, the ideal behaviour of gas 
and the Newtonian fluid are supposed. 

The boundary conditions are of Dirichlet type at the inlet part of the boundary:

$$c_i(x, t) = c_{i,D}(x, t), \ i = 1, \dots, N, \qquad \varrho(x, t) = \varrho_D(x, t), \qquad T(x, t) = T_D(x, t), \qquad x \in \Gamma_{in}, \ t \in (0, \bar t), \qquad (2)$$

and of Neumann type at the remaining part of the boundary:

$$\nabla c_i(x, t) \cdot n(x) = 0, \ i = 1, \dots, N, \qquad \nabla T(x, t) \cdot n(x) = 0, \qquad x \in \Gamma_{ex}, \ t \in (0, \bar t), \qquad (3)$$

where $n(x)$ is the outward normal to $\Gamma_{ex}$.

The initial conditions are set as

$$\varrho(x, 0) = \varrho_0(x), \qquad c_i(x, 0) = c_{0,i}(x), \ i = 1, \dots, N, \qquad T(x, 0) = T_0(x). \qquad (4)$$



3 Numerical model of mass transport 

In this section, we focus on the numerical model of mass transport. The energy transport model is very similar.

Before starting with the numerical scheme, let us make a note about the space decomposition. The presented model uses meshes consisting of trilateral prisms with parallel bases. The base of the cylinder is decomposed into a set of triangles and the triangulation is then extruded along the z axis up to the height of the cylinder. It should be noted that the 2-D mesh must satisfy some conditions needed for the consistency of the numerical scheme. These conditions can be found e.g. in [2] as the definition of an admissible mesh.

Using the notation from [2], let us introduce the set of control volumes $\mathcal{T}$, the set of their faces $\mathcal{E}$ and the set of points $\mathcal{P}$ such that each point $x_K \in \mathcal{P}$ can be uniquely assigned to one control volume $K \in \mathcal{T}$. Suppose the mesh $\{\mathcal{T}, \mathcal{E}, \mathcal{P}\}$ approximates the domain $\Omega$ and is admissible. Let us denote by $m(K)$ the volume of $K \in \mathcal{T}$, by $m(\sigma)$ the area of $\sigma \in \mathcal{E}$, by $\mathcal{N}(K)$ the set of control volumes adjacent to $K \in \mathcal{T}$, and by $\mathcal{E}_K$ the set of sides of the volume $K$. For adjacent control volumes $K$ and $L$ let $K|L \in \mathcal{E}_K$ be their common side. Let $d_{K,\sigma}$ be the distance between the point $x_K$ and the side $\sigma$. Let $d_{K|L}$ denote the distance between $x_K$ and $x_L$.

The time discretization is realized by the ascending sequence of time values $(t_n)_{n \in \mathbb{N}_0}$, $t_0 = 0$, with the time step $\Delta t_n = t_{n+1} - t_n$.

Explicit finite volume scheme of the problem (l)-(4) can be written as 
follows: 



— rP 



cr^,(T,+ 



( 5 ) 



(tESk 



ae^K 









i = 1, 



,iV, KeT, 



where $\varrho^n_{i,K} = \varrho^n_K c^n_{i,K}$ means the partial density of species $i$ in the finite volume $K$. We use such a notation that e.g. $\varrho^n_K$ approximates $\varrho(x_K, t_n)$. Mass fractions in the advective and source terms are expressed following the upwind scheme as



Jl 

i,K 


IV 


ji 




'i,L 


a = K\L(^ da, 


,n 

'i,a,D 


<0, aCTi 






The diffusion term is approximated by 

m(cr) 

K\L 

l(^ 






cIk if7]^<0. 



Fi^K,a = \ 



iora = K\L(tdQ, 



- cIk) for CT C Ti„, 
0 for cr c Fex. 



( 6 ) 



( 7 ) 



Final decomposition into the density and mass fraction is computed as follows: 

N 









^71+1 



i=l 






The scheme is conservative. The main disadvantage of the proposed model is the strong restriction on the selection of the time step $\Delta t_n$. A simple analysis shows that the condition



Atn < 



gm{K) 






m{K\L) yj^n 






LeJ^{K) 



d 



K\L 



+ 9 E 

creSK 



m{cr) 
for all $K \in \mathcal{T}$ and $i = 1, \dots, N$ assures stability of the scheme in an incompressible stationary flow field. It leads to the $h^2$-stability of the scheme ($h$ has the meaning of the largest diameter of the finite volumes of the mesh).

One more disadvantage of the scheme arises from the use of the upwind approximation of the advective term: numerical diffusion. The analysis of the 1-D scheme in [3] shows that the 1-D upwind numerical solution of the advection-diffusion equation with diffusivity $\mathcal{D}$ on an equidistant finite volume mesh approximates the exact solution with the diffusivity $\mathcal{D} + \mathcal{D}_{num}$, where

$$\mathcal{D}_{num} = \frac{1}{2} v h \left(1 - \frac{v \Delta t}{h}\right). \qquad (8)$$

Here $v$ is the velocity and $h$ is the length of the finite volumes.

Our approach to the elimination of numerical diffusion is simply to subtract the estimated numerical diffusivity from all physical coefficients. The numerical diffusivity is estimated on each inter-volume face of the mesh separately. The subtraction cannot be performed arbitrarily: the resulting coefficients must remain non-negative, i.e. in the scheme, the coefficient

$$\mathcal{D}_{red} = \max\{\mathcal{D} - \mathcal{D}_{num}, 0\} \qquad (9)$$

is applied instead of $\mathcal{D}$.

Additionally, the estimate (8) of $\mathcal{D}_{num}$ has to be extended to the multi-dimensional case. This is done by using the same formula (8) and estimating the 1-D parameters $v$ and $h$ in the following way: on each face $\sigma = K|L$, $v$ is set equal to the magnitude of the dominant velocity in a small neighbourhood and $h$ is estimated as the length of the projection of the vector $(x_L - x_K)$ onto the direction of the dominant velocity.
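The estimate of the numerical diffusivity and its reduction can be illustrated by a small 1-D experiment. The Python sketch below is not the authors' code; the function names and parameter values are hypothetical. It assumes the standard form $\frac{1}{2} v h (1 - v \Delta t / h)$ of the numerical diffusivity, pure advection (zero physical diffusivity), a unit pulse as initial condition, and a mesh long enough that the pulse never reaches the outflow end. The spread of the explicit upwind solution is then compared with the spread $2 \mathcal{D}_{num} t$ of a diffusion process.

```python
import numpy as np

def numerical_diffusivity(v, h, dt):
    """Assumed estimate of the upwind numerical diffusivity: (v*h/2)*(1 - v*dt/h)."""
    return 0.5 * v * h * (1.0 - v * dt / h)

def reduced_diffusivity(D, v, h, dt):
    """Reduced coefficient: physical minus numerical diffusivity, clipped at zero."""
    return max(D - numerical_diffusivity(v, h, dt), 0.0)

def upwind_variance_growth(v, h, dt, steps, cells=400, start=50):
    """Advect a unit pulse with the explicit upwind scheme (v > 0, no physical
    diffusion) and return the growth of the spatial variance of the profile."""
    courant = v * dt / h
    x = (np.arange(cells) + 0.5) * h          # cell centres
    c = np.zeros(cells)
    c[start] = 1.0

    def variance(u):
        mean = np.sum(x * u) / np.sum(u)
        return np.sum((x - mean) ** 2 * u) / np.sum(u)

    var0 = variance(c)
    for _ in range(steps):
        new = (1.0 - courant) * c             # part that stays in the cell
        new[1:] += courant * c[:-1]           # part arriving from the upwind cell
        c = new
    return variance(c) - var0

v, h, dt, steps = 1.0, 0.1, 0.05, 200
growth = upwind_variance_growth(v, h, dt, steps)
predicted = 2.0 * numerical_diffusivity(v, h, dt) * steps * dt
```

For this scheme the variance growth `growth` coincides with `predicted` up to rounding, which is why subtracting the estimated numerical diffusivity from the physical one is a natural correction, as long as the result stays non-negative.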



4 Tests of numerical diffusion 

4.1 1-D test 

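Our 1-D test problem of advection-diffusion transport of coloured fluid is

$$\frac{\partial c}{\partial t}(x, t) + v \frac{\partial c}{\partial x}(x, t) - \mathcal{D} \frac{\partial^2 c}{\partial x^2}(x, t) = 0, \qquad (x, t) \in (-\infty, +\infty) \times (0, \bar t), \qquad (10)$$

$$c(x, 0) = M \delta(x - x_0), \qquad (11)$$

where $\delta(x)$ denotes the Dirac function. The analytic solution of the problem is

$$c(x, t) = \frac{M}{2\sqrt{\pi \mathcal{D} t}} \exp\left[-\frac{(x - x_0 - vt)^2}{4 \mathcal{D} t}\right], \qquad (12)$$

where $M$ is the initial mass of coloured fluid in the domain.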
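The analytic solution of the 1-D problem can be evaluated directly, which is useful as a reference when computing the error of a numerical solution. The following Python sketch is illustrative only: the function name `c_analytic` and all parameter values are assumptions, and the solution is taken in the standard Gaussian form $M / (2\sqrt{\pi \mathcal{D} t}) \exp[-(x - x_0 - vt)^2 / (4 \mathcal{D} t)]$. It checks two properties that follow from the formula: the total mass stays equal to $M$, and the peak travels with the velocity $v$.

```python
import numpy as np

def c_analytic(x, t, M, v, D, x0):
    """Solution of c_t + v c_x - D c_xx = 0 with c(x, 0) = M * delta(x - x0)."""
    return M / (2.0 * np.sqrt(np.pi * D * t)) * np.exp(
        -(x - x0 - v * t) ** 2 / (4.0 * D * t))

# Illustrative parameters (not taken from the paper)
M, v, D, x0, t = 1.5, 1.0, 0.2, 5.0, 2.0

x = np.linspace(-40.0, 60.0, 200001)
c = c_analytic(x, t, M, v, D, x0)

dx = x[1] - x[0]
mass = np.sum(c) * dx          # Riemann sum of the profile, close to M
peak = x[np.argmax(c)]         # location of the maximum, close to x0 + v*t
```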
Fig. 3. 1-D test mesh



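The solution of the problem (10)-(11) is compared with the numerical solution of its 2-D finite-domain approximation:

$$\frac{\partial c}{\partial t}(x, y, t) + v \cdot \nabla c(x, y, t) - \mathcal{D} \Delta c(x, y, t) = 0, \qquad (x, y, t) \in (0, d) \times (0, w) \times (0, \bar t) \qquad (13)$$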



5c(x,0,t) 

dy 

c{Q,y,t) 



dc(x^w,t) _ g (0,d),t e (0,i) 
^2^5^ = 0,2/ € {0,w),t G (0,t) 



c{x,y,0) 



1, {x,y) € Ko 
0, {x,y) ^ Ko ’ 



(14) 

where $K_0$ is the triangle including the point $(x_0, w/2)$. The problems are comparable if the parameter $M$ in (11) has the meaning of the area of the finite volume $K_0$ (see Fig. 3).

The tests were performed using the following parameters: $d = 20$, $x_0 = d/4 = 5$, $h = d/N$ (where $N$ is the number of control volumes). The mesh
consisted of equilateral triangles, therefore w = hcos 

Tab. 1 summarizes the results of several computations for comparison.
The error E is computed as the $L_1$ norm of the difference between the analytical
solution (12) and the numerical solution of (13)-(14):
$E = \|c_{\mathrm{num}}(x,y,\bar t) - c_{\mathrm{an}}(x,\bar t)\|_{L_1((0,d)\times(0,w))}$,
where $c_{\mathrm{num}}$ was computed numerically either without
[E(without)] or with [E(with)] the proposed reduction of numerical diffusion
(9).



4.2 2-D test 

The other presented test problem is a 2-D extension of the previous one. It is
given as

$\frac{\partial c}{\partial t}(\mathbf{x},t) + \nabla\cdot(c(\mathbf{x},t)\,\mathbf{v}) - \mathcal{D}\,\Delta c(\mathbf{x},t) = 0, \quad (\mathbf{x},t) \in \mathbb{R}^2 \times (0,\bar t),$ (15)

$c(\mathbf{x},0) = M\,\delta(|\mathbf{x} - \mathbf{x}_0|),$ (16)

with the analytical solution

$c_{\mathrm{an}}(\mathbf{x},t) = \frac{M}{4\pi\mathcal{D}t} \exp\left[-\frac{|\mathbf{x} - \mathbf{x}_0 - \mathbf{v}t|^2}{4\mathcal{D}t}\right].$

This solution is compared with the numerical solution of the finite-domain
approximation of the problem (15), (16), defined as follows: the governing
equation (15) is supposed to hold in the domain $\Omega$ shown in Fig. 4. The domain is
decomposed into a set of 800 equilateral triangles with side length 1. The point
$\mathbf{x}_0 = (5, 5\sqrt{3}/2)$. The boundary and initial conditions of the finite-domain
approximation are:
Table 1. Comparison of numerical results of 1-D tests for various input parameters
N (number of elements), v (velocity) and D (diffusivity), performed without and with
the proposed reduction of numerical diffusion

N    v    D     E(without)      E(with)         E(without)/E(with)
20   1    0.1   8.49 · 10^-3    7.61 · 10^-4    11.15
20   1    0.2   6.54 · 10^-3    5.86 · 10^-4    11.15
20   1    0.4   4.40 · 10^-3    3.58 · 10^-4    12.28
40   1    0.1   3.21 · 10^-3    2.06 · 10^-4    15.58
40   1    0.2   2.23 · 10^-3    1.33 · 10^-4    16.78
40   1    0.4   1.35 · 10^-3    7.24 · 10^-5    18.68
80   1    0.1   1.11 · 10^-3    4.83 · 10^-5    22.96
80   1    0.2   6.97 · 10^-4    2.35 · 10^-5    29.61
80   5    0.2   1.73 · 10^-3    8.98 · 10^-5    19.22
80   5    0.4   1.26 · 10^-3    5.76 · 10^-5    21.87
80   10   0.4   1.73 · 10^-3    8.98 · 10^-5    19.22


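$\nabla c(\mathbf{x},t)\cdot\mathbf{n}(\mathbf{x}) = 0, \quad \mathbf{x} \in \Gamma_{\mathrm{ex}},\ t \in (0,\bar t),$
$c(\mathbf{x},t) = 0, \quad \mathbf{x} \in \Gamma_{\mathrm{in}},\ t \in (0,\bar t),$
$c(\mathbf{x},0) = 1$ for $\mathbf{x} \in K_0$, $c(\mathbf{x},0) = 0$ for $\mathbf{x} \notin K_0,$ (17)

where $K_0$ is the finite volume containing the point $\mathbf{x}_0$, and the boundary is split into
an inlet part $\Gamma_{\mathrm{in}}$ and the remaining part $\Gamma_{\mathrm{ex}}$ according to the direction of the velocity.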




The solution is computed for three different homogeneous velocity fields
differing in the direction of the velocity (denoted in Fig. 4 as $\mathbf{v}_1$, $\mathbf{v}_2$ and
$\mathbf{v}_3$), the magnitude of the velocity and the diffusion coefficient. The error is computed
analogously to the 1-D case: $E = \|c_{\mathrm{num}}(\mathbf{x},\bar t) - c_{\mathrm{an}}(\mathbf{x},\bar t)\|_{L_1(\Omega)}$. The
results are collected in Tab. 2. The notation is the same as in the 1-D case.
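
A discrete version of this error measure can be sketched as follows. The sketch assumes that the numerical solution is piecewise constant on the finite volumes and that the analytical solution is sampled at one representative point per volume (one-point quadrature); the data layout and function names are hypothetical and need not match the authors' implementation.

```python
import math

def c_analytic_2d(x, y, t, M, vx, vy, D, x0, y0):
    """2-D Gaussian solution for a Dirac pulse of mass M released at (x0, y0)."""
    r2 = (x - x0 - vx * t) ** 2 + (y - y0 - vy * t) ** 2
    return M / (4.0 * math.pi * D * t) * math.exp(-r2 / (4.0 * D * t))

def l1_error(cells, c_num, t, **params):
    """Approximate L1 error.

    cells : list of (point_x, point_y, area), one entry per finite volume
    c_num : list of cell values of the numerical solution, in the same order
    """
    return sum(area * abs(value - c_analytic_2d(px, py, t, **params))
               for (px, py, area), value in zip(cells, c_num))
```

With `c_num` identically zero the function returns the mass of the analytical solution contained in the mesh, which gives a natural scale for judging the size of E.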

From Tab. 2 it can be seen that for advection-dominated transport our
method of reduction of numerical diffusion is not very efficient: in the case of the
Table 2. Numerical results of 2-D tests for various magnitudes and directions of
velocity v and diffusion coefficients D

|v|   direction   D     E(without)      E(with)         E(without)/E(with)
1     1 or 2      0.2   5.66 · 10^-5    7.81 · 10^-6    7.25
1     3           0.2   7.83 · 10^-5    1.74 · 10^-5    4.50
1     1 or 2      0.1   9.21 · 10^-5    1.59 · 10^-5    5.79
1     3           0.1   1.26 · 10^-4    5.10 · 10^-5    2.47
5     1 or 2      0.2   1.42 · 10^-4    6.97 · 10^-5    2.04 (34% red.)
5     3           0.2   2.05 · 10^-4    1.65 · 10^-4    1.24 (63% red.)


last two tests, the numerical diffusion coefficient was smaller than the physical
diffusion coefficient, and could thus be fully subtracted, on only 34% and 63% of the
mesh sides, respectively; on the remaining sides the resulting diffusion coefficient
was set to 0 according to (9).



5 Conclusions 

This contribution focused on the advection-diffusion model of mass transport
in a gas mixture. The stability condition and a simple method
for the reduction of numerical diffusion were presented. Numerical results show
that the proposed method of reduction of numerical diffusion works quite well
in 1-D and 2-D cases. Since it is not efficient for advection-dominated transport
problems, we are looking for a more sophisticated and more successful one; we
therefore plan to concentrate on the development of a more accurate numerical scheme
for this purpose.

Other parts of the model are also being developed. In particular, calibration
tests of energy production are continuing with more natural ignition conditions
and reaction parameters, and local tests of the flow model are being
performed.
Summary. A planar problem of convective filtration of an incompressible
multi-component fluid saturating a porous medium is studied. A combined spectral and
finite-difference approach is applied to compute nonstationary regimes and continuous
families of steady states with a variable spectrum.



1 Introduction 

Convective flows in a porous medium are the subject of many works [1]. For the
convective filtration of a viscous fluid in a porous medium (Darcy model), the
appearance of a one-parameter family of steady states with a spectrum that
varies along the family was observed [2]. V. Yudovich showed that these
families cannot be orbits of any symmetry group and developed the theory of
cosymmetry [3, 4]. Continuous families of steady states can be
investigated only experimentally or numerically. Computer modelling of Darcy
convection has been carried out by a spectral method [5], a finite-difference method [6], and
a combined spectral and finite-difference approach [7, 8]. It was found that
preserving the cosymmetry in the finite-dimensional approximation of the differential
equations is extremely important for the correct computation of continuous families of steady
states [6, 7]. In the present work we extend the approach of [7] to the
convection of a multi-component fluid obeying Darcy's law.



2 Darcy convection problem 

The equations of filtrational convection of multi-component fluid [9] in dimen- 
sionless form may be written as 

/3r9l == KrAd"" + \ri>x + = F'"', r = 1, . . . , 

r=l 

6)'’ = </)'■, r = l,...,5, V> = 0 



on &D. 



S. 



( 1 ) 

( 2 ) 

( 3 ) 
Here, x and y are Cartesian coordinates on a plane, t is time, $\Delta$ is the
Laplacian, = 'il^xOy — 4’y^x denotes the Jacobian operator, is the 

streamfunction, 9^ is the temperature, 6'^ (r = 2, . . . , S') are concentrations. 
This problem is characterized by the following parameters: Kr — Xr/^, 

Ar = gr]rArl‘^KIi'‘^^r = 1 , . . . , 5, where Ai is Rayleigh number and - pa- 
rameters for concentrations (r = 2 , . . . , S). Here for each r we have expan- 
sion coefficient 77^, gradient of distribution for species diffusivity coeffi- 
cient Xri kinematic coefficient and g is the acceleration due to gravity, I - 
height of enclosure, K - permeability, v - viscosity. We consider the enclosure 
D = [0, a] X [0,6] and pr are given and do not depend on time. 

Cosymmetry for underlying system is given by (7 /;, — acs^s*). Really, 
multiply (1) by 7/; and (2) by sum and integrate over domain V. Then, 

using integration by parts and Green’s formula we derive 



s . s 

-GY I^r6r)dxdy = 0 . 

r=l 



r=l 



(4) 



3 Method of solution 

Spectral and spectral-difference methods are powerful tools for solving
problems in mathematical physics [10]. We apply here the spectral-finite-difference
method in the form derived in [7]. The solution of the problem (1)-(3)
is sought in the form:

m 

Substituting (5) into (l)-(3) and performing projections, we obtain [7]: 

PsO^j = - CjKsOj -h Xs'ipj - PrJj = j = 1, . . . , m, (6) 

0 = t/;'.' - Cj'ijjj - e'f =Gj, j = 1, . . . , m, (7) 

l9j(t,0) = 6>j(t,a) == (/)^ = 'ipj{t,a) = 0, j = 1, . . . ,m. (8) 

Then, we define the grid co = {xk — kh,k = 0 ,...,n, /i = a/{n 1 )} on 

the segment [0,a] and introduce the notion: = 6'j{xk,t), 

J'^ = JJ(x/c,t). Using the centered finite- difference operators of second order 
accuracy on the three-point stencil we obtain a system of ordinary differential 
equations 



nr rxnr j_ nr 

Pr9jk — f^r ^2 



/l2 

I \ V’jj/c+l “ '0i,/c-l 

2X 



Pr^j^k — P 



^Ijk'i 



( 9 ) 
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0 



+ ■ 



“ h? 

nr nr 

- = (p2jk, 



2h 



^j^jk d“ 

r = l,...,S. 



( 10 ) 



Henceforth, the dot is a derivative with respect to t, cj = and JJ^ is 

expressed as 







i,k 



i=l 



tr _ 

^j,i,k 



2i+j 



[ds,k{0i+j,A)-ds,k{^i^ V’i+j)] -^[daM^i+j^'^i)+da,k{^h ’4’i+j)], 






where operators da,k and dg^k are derived in [7] using the requirement that 
discrete version of (4) took place 






2h 2h 



ds,k{^ -W = ^ • 

Excluding stream function we may rewrite the system (9)-(10) in a vector 

form 



dQr 

= {KrA + + L(6)^ A-^BG^), r - 1, . . . , 5, (11) 

where 

/or //Qr nr nr nr \ 

^ — V^ll5 •••5 •••5 ^nm)' 

The matrix A consists of m three-diagonal submatrices, the nonzero entries of 
the skew-symmetric matrix B = {bsr}^^=i are given by fes,s+i = —bg^i^s = 
/i/2,5 = 1 ,..., nm — 1, and L presents the nonlinear terms in (9). 

The system (11) has a trivial solution with zero velocity and linear temperature
and concentration profiles. The integration is performed by the fourth-order
Runge-Kutta method, and the family is calculated using the algorithm of
[5, 7]. Starting from the vicinity of the unstable zero equilibrium, we integrate the
system (11) up to a point close to a stable equilibrium on the family. The
algorithm for the family computation may then be formulated as the following
sequence of steps: correct the point using the modified Newton method; determine
the kernel of the linearization matrix at the given point and predict the
next point on the family; repeat these steps until a closed curve is obtained.
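
The integrate-correct-predict loop can be illustrated on a toy planar system that is not the Darcy problem: for dx/dt = (1 - x² - y²)(x, y) the origin is an unstable equilibrium and the unit circle is a continuous family of equilibria. Everything in the sketch below is an assumption made for illustration: the model, the step sizes, the function names, and the plain Newton corrector used in place of the modified Newton method of [5, 7].

```python
import math

def F(p):
    """Right-hand side of the toy system; every point with |p| = 1 is an equilibrium."""
    x, y = p
    s = 1.0 - x * x - y * y
    return (s * x, s * y)

def jacobian(p):
    x, y = p
    s = 1.0 - x * x - y * y
    return ((s - 2 * x * x, -2 * x * y), (-2 * x * y, s - 2 * y * y))

def rk4_step(p, dt):
    """One step of the classical fourth-order Runge-Kutta method."""
    shift = lambda k, c: (p[0] + c * k[0], p[1] + c * k[1])
    k1 = F(p)
    k2 = F(shift(k1, dt / 2))
    k3 = F(shift(k2, dt / 2))
    k4 = F(shift(k3, dt))
    return tuple(p[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2))

def newton_correct(p, tol=1e-12, max_iter=50):
    """Corrector: plain Newton iteration (stand-in for the modified Newton method)."""
    for _ in range(max_iter):
        f = F(p)
        if math.hypot(f[0], f[1]) < tol:
            break
        (a, b), (c, d) = jacobian(p)
        det = a * d - b * c
        p = (p[0] - (d * f[0] - b * f[1]) / det, p[1] - (a * f[1] - c * f[0]) / det)
    return p

def kernel_direction(p, previous):
    """Unit kernel vector of the rank-one Jacobian, oriented like the previous tangent."""
    r = max(jacobian(p), key=lambda row: math.hypot(row[0], row[1]))
    n = math.hypot(r[0], r[1])
    t = (-r[1] / n, r[0] / n)  # orthogonal to the dominant row, hence in the kernel
    if previous is not None and t[0] * previous[0] + t[1] * previous[1] < 0:
        t = (-t[0], -t[1])
    return t

def compute_family(step=0.1, max_points=1000):
    p = (0.01, 0.02)                 # start near the unstable zero equilibrium
    for _ in range(400):             # integrate until close to the family
        p = rk4_step(p, 0.05)
    family, tangent = [newton_correct(p)], None
    for k in range(max_points):
        tangent = kernel_direction(family[-1], tangent)   # predictor direction
        predicted = (family[-1][0] + step * tangent[0],
                     family[-1][1] + step * tangent[1])
        family.append(newton_correct(predicted))          # corrector
        if k >= 2 and math.dist(family[-1], family[0]) < step:
            break                                         # the curve has closed
    return family

family = compute_family()
```

In this toy case the Jacobian has a one-dimensional kernel at every point of the family, so the kernel vector is the tangent to the curve; in the discretised Darcy problem the kernel would have to be computed numerically from the linearization matrix.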
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4 Numerical Results 



We explored the derived technique to calculate the families consisting of both 
stable and unstable equilibria. The narrow enclosure (b/a > 1, a = 1) is consid- 
ered and the case of = 1, — 1 is analyzed. We found that a scenario of the 

appearance of a family of steady states for a multi-component fluid
with co-directed temperature and concentration gradients is similar to that
for a one-component fluid [11]. First, a family consisting of stable steady states





Fig. 1. Evolution of the family of stationary regimes; S = 2, b = 2, a = 1; left ($\lambda_2 = -5$):
$\lambda_1 = 72$ (curve 1), 100 (2), 120 (3), 138.1 (4); right ($\lambda_2 = 5$): $\lambda_1 = 40$ (curve 1), 60
(2), 80 (3), 95.5 (4)



branched off from the state of rest (zero equilibrium) as a result of monotonic
instability. As the Rayleigh parameter $\lambda_1$ increases, the family deforms, and
arcs of unstable equilibria then appear on it. In Fig. 1 the families for the
case of a two-component fluid and a container with b = 2 are given as
projections onto the plane of $Nu_h$ and $Nu_v$:

Nuh = j el(^,y,t)dy, Nuy = j 6l{x,Q,t) dx. 

At the critical value $\lambda_u$ two unstable points appear on the family. For
instance, for $\lambda_2 = -5$ we found this instability at $\lambda_u = 138.1$ (circles A and
E in Fig. 1, left), and for $\lambda_2 = 5$ at $\lambda_u = 95.5$ (asterisk
in Fig. 1, right). In Figs. 2 and 3 we present the streamlines, isotherms
and concentration isolines corresponding to the letters in Fig. 1.

At a large negative concentration gradient and small diffusion coefficients
the state of rest loses stability in an oscillatory manner. We have observed here
a new scenario of the development of convective regimes. Convective transitions are
Fig. 2. Stream functions (I), temperature (II), concentration (III) for some regimes,
S = 2, b = 2, a = 1, $\lambda_1 = 72$, $\lambda_2 = -5$








organized using nonstationary regime branched off from zero equilibrium (state 
of rest), and also two families of the stationary solutions were born ’from 
air’. The given scenario is realized for the cases of two-component and three- 
component fluids. We draw the development of convective regimes for two- 
component fluid in Fig. 4; the parameters are the following A 2 = — 10, /^2 = 0.3 
and 6 = 2. 

The state of rest is stable up to $\lambda_1 \approx 77$, and simultaneously there exist
two families of steady states originating via an 'out of thin air' bifurcation [12].
Each of these families consists of stable and unstable arcs (here we mean
transversal stability or instability with respect to the family). The state of rest loses
its stability via a Poincaré-Andronov-Hopf bifurcation (oscillatory instability),
and a stable limit cycle is formed (curve 3 in Fig. 4). As $\lambda_1$ increases, both
families become more complicated and at $\lambda_1 = 79.4$ collide with each other.
Two new families then appear: a wholly unstable one (curve 4) and a partially
stable one (curve 5). A further increase of $\lambda_1$ leads to the reduction of the unstable
arcs and their disappearance. After that, on the interval $80 < \lambda_1 < 156.5$
the given family is wholly stable. On a small interval of $\lambda_1$ the limit cycle undergoes
a sequence of bifurcations leading to a chaotic regime. The latter collides with
the unstable family (curve 4) and disappears. This family shrinks and vanishes
at $\lambda_1 = 85$ as a result of a collision with the state of rest, which corresponds to the
transition of two eigenvalues from the left half-plane to the right one along the real axis.
We then observe the growth of the main family and the appearance of four arcs
of unstable equilibria as a result of oscillatory instability at $\lambda_1 = 156.6$.



5 Conclusion 

The combined spectral and finite-difference approach makes it possible
to compute continuous families of steady states in the planar problem of
convective filtration of an incompressible multi-component fluid saturating a porous
medium. We studied the scenario of transformation of convective regimes and,
in particular, the onset of instability on the family of steady states with a variable
spectrum.
Summary. The quasi-static evolution of an elastoplastic body with a multi-surface
constitutive law of linear kinematic hardening type allows the modelling of curved
stress-strain relations. It generalises classical small-strain elastoplasticity from one
to several plastic phases. First, we briefly recall a mathematical model represented
by an initial-boundary value problem in the form of a variational inequality. The
main concern of this paper is then an efficient numerical implementation of
a one-time-step problem. Based on the minimisation problem, we describe an iterative
non-linear algorithm whose linear subsystems are solved by a geometric multigrid
method. Finally, numerical computations in 2D and 3D are presented.



1 Introduction 

In this paper we consider the quasi-static initial-boundary value problem for
small-strain elastoplasticity with a multi-surface constitutive law. We treat
here a Prandtl-Ishlinskii model of play type, which goes back in the 1D
case to Prandtl [Pra28] and Ishlinskii [Ish54] and in the multidimensional
case to Besseling [Bes58] and Iwan [Iwa66]. The model extends the classical
linear kinematic hardening model (single-yield model), which goes back to
Melan [Mel38] and Prager [Pra49], in the sense that it operates with several
plastic strains (multi-yield model). Hysteresis properties have been intensively
studied by Visintin [Vis94] and Krejčí [Kre96], amongst others. Our functional
formulation of the model and its analysis are based on a direct extension of the
work of Han and Reddy [HR99] for the linear kinematic hardening model in
terms of a time-dependent variational inequality. Our numerical approximation
of the one-time-step problem uses the formulation of Alberty, Carstensen,
and Zarrabi [ACZ99] extended to a two-yield model, where the solution
parameters, i.e., the displacement and two plastic strains, are sought as minimisers
of a convex but non-smooth functional. In our approach we regularise this
functional, so that standard methods can be applied to the resulting quadratic optimisation
problem. The main idea of the algorithm is the use of the Schur complement
form of the discretised problem in the displacements. The arising linear system
is solved by a multigrid preconditioned conjugate gradient solver.







J. Kienesberger , J. Valdman 




Fig. 1. Prandtl-Ishlinskii model of play type (left) and its $\sigma$-$\varepsilon$ hysteresis-type
behaviour for a periodic stress $\sigma(t) = A \sin(t)$, $t \in (0, 2\pi)$ (right)



The paper is organised as follows: In Section 2, the local material model 
is presented, which is the basis for the boundary value problem in Section 3. 
The numerical algorithm is designed in Section 4, the numerical experiments 
are presented in Section 5. Finally, an outlook on the work still to do is given. 



2 The Local Material Model 

The constitutive law furnishes the relationship between the stress tensor $\sigma$ and
the strain tensor $\varepsilon$. The model discussed here is the Prandtl-Ishlinskii model
of play type described by Visintin [Vis94] and Krejčí [Kre96], among others.
It contains finitely many surfaces, and its rheological structure and typical
hysteresis behaviour are depicted in Figure 1. It is local in the sense that for
any given material point x it involves only the time histories $\sigma = \sigma(t)$ and
$\varepsilon = \varepsilon(t)$ at that point. It is given by the following system of equations and an
evolution variational inequality:

$\varepsilon = e + p, \quad p = \sum_{r \in I} p_r, \quad \sigma = \sigma_r^e + \sigma_r^p, \quad r \in I,$ (1)

$\sigma = \mathbb{C}e,$ (2)

$\sigma_r^e = \mathbb{H}_r p_r, \quad r \in I,$ (3)

$\sigma_r^p \in Z_r, \quad \dot p_r : (\tau_r - \sigma_r^p) \le 0 \quad \text{for all } \tau_r \in Z_r,\ r \in I.$ (4)
Equation (1) represents the additive decomposition of the strain $\varepsilon$ into its
elastic part e and its plastic part p, as well as of the stress $\sigma$ into the backstresses
$\sigma_r^e$ and the plastic stresses $\sigma_r^p$, where $I = \{1, \ldots, M\}$. The plastic strain p is
additively decomposed into internal plastic strains $p_r$. Equation (2) is
a linear elastic law; in the isotropic case one has

$\mathbb{C}e = 2\mu e + \lambda(\operatorname{tr} e)\,\mathbb{I},$ (5)

where the (positive) coefficients $\mu$ and $\lambda$ are called Lamé coefficients. Here $\mathbb{I}$
denotes the second-order identity tensor (an identity matrix), and
$\operatorname{tr} : \mathbb{R}^{d \times d} \to \mathbb{R}$ defines the trace of a matrix,
$\operatorname{tr} e := \sum_{j=1}^{d} e_{jj}$ for $e \in \mathbb{R}^{d \times d}$, where d is the
problem dimension. Equation (3) couples the backstresses $\sigma_r^e$ and the plastic
strains $p_r$ through linear mappings with positive definite hardening matrices
$\mathbb{H}_r$, $r \in I$. A typical choice is $\mathbb{H}_r = h_r \mathbb{I}$, where $h_r > 0$, $r \in I$, are hardening
coefficients. The variational inequality (4) formalises the Prandtl-Reuß normality
law, also called the principle of maximal dissipation. The sets $Z_r \subset \mathbb{R}^{d \times d}_{\mathrm{sym}}$, $r \in I$,
describe the admissible (plastic) stresses; their boundaries $\partial Z_r$ are called the
yield surfaces. We will exclusively use the standard von Mises cylinder with
yield stress $\sigma_r^y$,

$Z_r = \{\sigma \in \mathbb{R}^{d \times d}_{\mathrm{sym}} : \|\operatorname{dev} \sigma\| \le \sigma_r^y\}.$ (6)

Here, $\|a\|^2 = a : a$ and $a : b = \sum_{i,j=1}^{d} a_{ij} b_{ij}$ define the (Frobenius) norm
and the corresponding scalar product, and the deviator of $\sigma$ is defined as
$\operatorname{dev} \sigma = \sigma - \frac{1}{d}(\operatorname{tr} \sigma)\,\mathbb{I}$. Since this model is described by several (namely M) yield
stresses $\sigma_r^y$, we classify it as a multi-yield model, or as an M-yield model
in order to express the number of yield stresses. If M = 1, we speak of
a single-yield model, which represents the classical linear kinematic hardening
model.
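
The admissibility test behind the von Mises cylinder is easy to state in code. The following Python sketch works on small dense matrices stored as nested lists; the function names are illustrative and not taken from the authors' implementation.

```python
import math

def trace(a):
    """Trace of a square matrix given as a list of rows."""
    return sum(a[i][i] for i in range(len(a)))

def dev(a):
    """Deviator: a - (tr a / d) * I, where d is the matrix dimension."""
    d = len(a)
    mean = trace(a) / d
    return [[a[i][j] - (mean if i == j else 0.0) for j in range(d)] for i in range(d)]

def frobenius_norm(a):
    """Norm induced by the scalar product a : b = sum_ij a_ij * b_ij."""
    return math.sqrt(sum(x * x for row in a for x in row))

def is_admissible(sigma, yield_stress):
    """True if sigma lies in the von Mises cylinder with the given yield stress."""
    return frobenius_norm(dev(sigma)) <= yield_stress

sigma = [[3.0, 0.0], [0.0, 1.0]]   # sample 2-D stress with deviator norm sqrt(2)
```

A purely hydrostatic stress has zero deviator and is therefore admissible for every non-negative yield stress, which is why the set is a cylinder rather than a ball.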



3 The Boundary Value Problem 

The elastoplastic continuum is assumed to occupy a bounded domain $\Omega \subset \mathbb{R}^d$
with a Lipschitz boundary $\Gamma = \partial\Omega$. The boundary $\Gamma$ is split into a Dirichlet
boundary $\Gamma_D$, a closed subset of $\Gamma$ with positive surface measure, and the
remaining (relatively open and possibly empty) Neumann part $\Gamma_N := \Gamma \setminus \Gamma_D$.
We pose essential and static boundary conditions, namely

$u = 0$ on $\Gamma_D$ and $\sigma \cdot n = g$ on $\Gamma_N$,

where g is a given applied surface force and n denotes the outer normal to the
boundary $\Gamma_N$. Our analysis will be restricted to the study of a boundary value
problem defined in these function spaces:

$H_D^1(\Omega) = \{v \in H^1(\Omega)^d \mid v = 0 \text{ on } \Gamma_D\},$

$Q = \{q : q \in \operatorname{dev} \mathbb{R}^{d \times d}_{\mathrm{sym}},\ q_{ij} \in L^2(\Omega)\},$
where $H^1(\Omega)$ and $L^2(\Omega)$ are the usual Sobolev and Lebesgue spaces. The
condition $q \in \operatorname{dev} \mathbb{R}^{d \times d}_{\mathrm{sym}}$ in the definition of Q implies that $\operatorname{tr} q = 0$, i.e., q is a
trace-free matrix. It is shown by Brokate, Carstensen, Valdman [BCV03] that
the combination of the system (1)-(4) describing the Prandtl-Ishlinskii model
of play type with the (quasi-static) equilibrium between external (denoted
by f) and internal forces, i.e.,

$\operatorname{div} \sigma(x,t) + f(x,t) = 0, \quad x \in \Omega,\ t \in (0,T),$ (7)

results in the time-dependent variational inequality for the state variable
$w = (u, (p_r)_{r \in I})$:

$a(w(t), z - \dot w(t)) + \psi(z) - \psi(\dot w(t)) \ge \langle \ell(t), z - \dot w(t) \rangle \quad \text{for all } z \in \mathcal{H}.$ (8)

The state w is considered to be an element of the Hilbert space
$\mathcal{H} = H_D^1(\Omega) \times \prod_{r \in I} Q$
and to satisfy the zero initial condition w(0) = 0. Writing $z = (v, (q_r)_{r \in I})$, a
bilinear form $a(\cdot,\cdot)$, a linear functional $\ell(\cdot)$ and a nonlinear functional $\psi(\cdot)$ are
defined as:



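$a : \mathcal{H} \times \mathcal{H} \to \mathbb{R}, \quad a(w,z) = \int_\Omega \mathbb{C}\Big(\varepsilon(u) - \sum_{r \in I} p_r\Big) : \Big(\varepsilon(v) - \sum_{r \in I} q_r\Big)\,dx + \sum_{r \in I} \int_\Omega \mathbb{H}_r p_r : q_r\,dx,$ (9)

$\ell(t) : \mathcal{H} \to \mathbb{R}, \quad \langle \ell(t), z \rangle = \int_\Omega f(t) \cdot v\,dx + \int_{\Gamma_N} g(t) \cdot v\,dS(x),$ (10)

$\psi : \mathcal{H} \to \mathbb{R}, \quad \psi(z) = \sum_{r \in I} \int_\Omega \sigma_r^y \|q_r\|\,dx.$ (11)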

We can thus state the boundary value problem of quasi-static elastoplasticity
as follows.

Problem 1 (BVP of quasi-static multi-surface elastoplasticity).

For given $\ell \in H^1(0,T; \mathcal{H}^*)$ with $\ell(0) = 0$, find $w \in H^1(0,T; \mathcal{H})$ with w(0) = 0
such that (8) holds for almost all $t \in (0,T)$.

The unique solvability of Problem 1, under the assumption that the elastic
and hardening tensors are symmetric and positive definite, is based on an
extension of the proof of Han and Reddy [HR95, HR99] and can be found in the
works of Valdman [Val02] and Brokate, Carstensen, Valdman [BCV03]:

Theorem 1. Let $\ell \in H^1(0,T; \mathcal{H}^*)$ with $\ell(0) = 0$. Then there exists a unique
solution $w \in H^1(0,T; \mathcal{H})$ of Problem 1.
4 Numerical Algorithm

The starting point for the finite element method is the time-discretised form of
the variational problem. Problem 1 is discretised in time implicitly;
we use the implicit Euler scheme with equidistant time intervals.

It was shown by Alberty, Carstensen, and Zarrabi [ACZ99] that
in the single-yield case, i.e., M = 1, the time-discretised dual formulation in
each time step is equivalent to an optimisation problem depending only on
the displacement u and the plastic strain p. This result was obtained by
functional analytic arguments: the variational inequality is regarded as
a sub-differential, for which the dual sub-differential exists and can be
reformulated. The resulting objective depends on the chosen hardening law (linear
kinematic hardening or isotropic hardening), though the structure remains the
same. In Kienesberger [Kie03] an algorithm solving the single-yield problem
was developed using the results and notation of Alberty, Carstensen,
and Zarrabi [ACZ99]. Since the multi-yield hardening model structurally
generalises the linear kinematic hardening model, the authors were able to extend
the original code effectively using C++ templates, so that the
multi-yield hardening model becomes a new hardening model. For computational
reasons new parameters $\alpha_r$ are introduced, which are internal hardening
parameters of the same dimension as the plastic strains $p_r$ and are
defined by $\alpha_r = \mathbb{H}_r p_r$.

The notation is as follows: for given variables with index 0 at an initial time
step $t_0$, the updates of the variables at the time step $t_0 + \Delta t$ have to be
determined. The already time-discretised generalised optimisation problem for
the multi-yield case in each time step, modified to fit
the single-yield algorithm, reads as:



f{u,pi, . . . ,pm) '-=\ j <C(£(w) - ^Pr) : (£(m) - 

n .€/ rei 

+ l [ + ^ • (P^~Pr) 

n n i -e/ 

+ j '^a^lpr - Pr\dx - j fudx ^ min, 



dx 



(12) 



where $\alpha_r^0$ is the internal hardening variable from the initial time step.

The basic idea for solving the quasi-static problem is to use a uniform
time discretisation and to iterate in each time step until the minimisers, i.e., the
displacement u and the plastic strains $p_r$, are determined. These values
and the separately calculated $\alpha_r$ are then used as the reference values with index 0
for the next time step.

The fifth term in (12) contains a norm, whose sharp bend may cause
trouble, as the function f is not differentiable. To apply standard methods,
the objective should be differentiable and quadratic; thus the function
is regularised as follows: the term $|\cdot|$ is regularised by smoothing the norm
function, i.e.,

$|p|_\epsilon := \begin{cases} |p| & \text{if } |p| \ge \epsilon, \\ \frac{1}{2\epsilon}|p|^2 + \frac{\epsilon}{2} & \text{if } |p| < \epsilon. \end{cases}$ (13)


For small e, the quadratised function f{u,pi , . . . ^Pm) is very similar to the 
original one, but its properties change enormously. Therefore, it will be referred 
to by the new symbol /. 
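
To make the effect of such a smoothing concrete, here is a small Python sketch. It assumes one common choice of regularisation: the norm is kept for |p| ≥ ε and replaced by the parabola |p|²/(2ε) + ε/2 below ε, which makes the function continuously differentiable. The plastic strain is treated as a flat vector with the Euclidean norm, and the function names are hypothetical.

```python
import math

def norm_eps(p, eps):
    """Smoothed norm: |p| for |p| >= eps, a quadratic cap for |p| < eps (assumed form)."""
    r = math.sqrt(sum(x * x for x in p))
    if r >= eps:
        return r
    return r * r / (2.0 * eps) + eps / 2.0

def grad_norm_eps(p, eps):
    """Gradient of the smoothed norm: p / |p| outside the eps-ball, p / eps inside."""
    r = math.sqrt(sum(x * x for x in p))
    scale = 1.0 / max(r, eps)
    return [x * scale for x in p]
```

The two branches agree in value and slope at |p| = ε, and the gradient at the origin is zero, so gradient-based methods no longer see the kink of the original norm.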

Another simplification is defining the change of pr by pr — Pr — Pr^ and 
using it as an argument of the objective instead of pr'. 

The spatial discretisation is carried out by the standard finite element 
method using linear triangular, resp. tetrahedral finite elements. For reasons 
of better readability and coherence, the name of the vector denoting the dis- 
cretised displacement u is again u. The same is valid for pr^ p% furthermore 
the symmetric matrices are transformed to vectors, e.g. in 2D 



( Pr^ Pr^ 

[pr^ pf 



pV 

p? 

Pr^ 



such that the objective and other equations can be written in a matrix and 
vector notation. 

For the derivation of the algorithm and numerical experiments we will 
consider only the two- yield case, i.e., M == 2, as it shows the characteristics of 
the multi-yield problem and can be extended easily. Now, the objective reads 
as 



^ /w\^ /B^CB -B'^C -B^C\ fu\ 
f(u,puP2) = ^\Pi] \ -CB C + V^ C 

^ Vp2/ V-CB C C + Vy \p2j 

f-f - + p^)\ ^ / u\ (14) 

+ €(p?+p^)+Qa? Pi 

V C(p? + p^) + / \P2/ 

+ : p? + icp^ : p^ + ^ min, 

where Bu denotes the discretised strain s{u)^ and Q is the result of regarding 
Pr as vectors, i.e., the matrix norm is defined by \p\ = (p^Qp) 2 . 

pi izr Q(1 -h 1 ^^) is the non-linear iteration matrix of / with respect 
to pi, and analogous for and p 2 . These matrices are computed in every 
iteration step using the current pr, but apart from that the dependencies on 
\pr\e will be neglected. This is not an exact method for determining the change 
of the plastic strain, but its error will be corrected later on as the pr will be 
calculated separately and iteratively with the alternating direction method. 
Fig. 2. Geometry of a beam (left) and the quarter of a ring (right) problems



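The matrix in (14) is positive definite; thus the minimiser $(u, p_1, p_2)$ has to
fulfil the necessary condition that the derivative equals zero:

$\begin{pmatrix} B^T C B & -B^T C & -B^T C \\ -C B & C + D_1 & C \\ -C B & C & C + D_2 \end{pmatrix} \begin{pmatrix} u \\ p_1 \\ p_2 \end{pmatrix} + \begin{pmatrix} -f - B^T C (p_1^0 + p_2^0) \\ C (p_1^0 + p_2^0) + Q\alpha_1^0 \\ C (p_1^0 + p_2^0) + Q\alpha_2^0 \end{pmatrix} = 0.$ (15)

Extracting the vector $(p_1, p_2)^T$ from the two lower lines in (15) and inserting
it into the first one yields the Schur complement system in u:

$\left[ B^T C B - \begin{pmatrix} B^T C & B^T C \end{pmatrix} \begin{pmatrix} C + D_1 & C \\ C & C + D_2 \end{pmatrix}^{-1} \begin{pmatrix} C B \\ C B \end{pmatrix} \right] u = f + B^T C (p_1^0 + p_2^0) - \begin{pmatrix} B^T C & B^T C \end{pmatrix} \begin{pmatrix} C + D_1 & C \\ C & C + D_2 \end{pmatrix}^{-1} \begin{pmatrix} C (p_1^0 + p_2^0) + Q\alpha_1^0 \\ C (p_1^0 + p_2^0) + Q\alpha_2^0 \end{pmatrix}.$ (16)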

This linear system is solved by a multigrid preconditioned conjugate gradient
method, see e.g. Bramble [Bra95]. Numerical tests have shown
that it is not necessary to use the multigrid preconditioner arising from the
plasticity problem; the preconditioner for the related elasticity problem is
sufficient and much faster.

For the multigrid method, we use one Gauss-Seidel pre- and post-smoothing
step in a V-cycle; the system on the coarse grid is solved exactly. Furthermore,
the nested iteration approach was used, which means that the starting values
for the coarse grid correction are the restrictions of the fine grid functions.
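
The elimination of the plastic strains that leads to the Schur complement system in u can be tried out on a scalar toy version of the block system, in which B, C, D_1 and D_2 are plain numbers. The sketch below is only an illustration of the block elimination under that assumption; the names `b`, `c`, `d1`, `d2`, `c1` and `g` are hypothetical stand-ins for the blocks and the constant vector, and no multigrid or conjugate gradient solver is involved.

```python
def solve2(M, r):
    """Solve a 2x2 linear system M x = r by Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det)

def schur_solve(b, c, d1, d2, c1, g):
    """Solve the scalar analogue of the three-block system

        [ b*c*b  -b*c   -b*c  ] [u ]   [c1]
        [ -c*b   c+d1    c    ] [p1] + [g1] = 0
        [ -c*b    c     c+d2  ] [p2]   [g2]

    by eliminating (p1, p2) first and then solving for u.
    """
    M = ((c + d1, c), (c, c + d2))
    Minv_a21 = solve2(M, (c * b, c * b))   # M^{-1} applied to the coupling column
    Minv_g = solve2(M, g)                  # M^{-1} applied to the constant part
    row = (b * c, b * c)
    schur = b * c * b - (row[0] * Minv_a21[0] + row[1] * Minv_a21[1])
    rhs = -c1 - (row[0] * Minv_g[0] + row[1] * Minv_g[1])
    u = rhs / schur
    p = tuple(Minv_a21[i] * u - Minv_g[i] for i in range(2))
    return u, p

u, (p1, p2) = schur_solve(b=2.0, c=3.0, d1=1.0, d2=4.0, c1=-5.0, g=(0.5, -1.5))
```

For positive c, d1 and d2 the scalar Schur complement is positive, mirroring the positive definiteness that makes a conjugate gradient solver applicable in the full problem.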



5 Numerical Experiments 

The algorithm was implemented in NGSolve, the finite element solver extension
package of the mesh generator tool NETGEN developed in our group.
Finite element basis functions were chosen as piecewise linear for the
displacement u and piecewise constant for the plastic strains $p_1$ and $p_2$. Furthermore,
Fig. 3. Plasticity domains in the single-yield (left) and two-yield case (right) of the
beam



the full multigrid method was used, i.e., we started with a coarse grid, solved 
the problem, refined the grid, solved the problem on the finer grid et cetera. 

The algorithm was tested on two- as well as on three-dimensional domains, 
for both the single-yield and multi-yield case, see Figure 2 for the geometries. 

The first test geometry is the 2D beam of Figure 2 with the left edge
fixed and the right edge loaded by a force acting in the direction of the
outer normal vector. The second test geometry is the 3D quarter of a
ring from Figure 2 with constant thickness along the z-axis, equal to
the thickness of the ring in the 2D sketch. The quarter ring is fixed on the lower
face, and a force acts upwards on the right face. The finest uniform mesh
consists of 131 072 triangles (which corresponds to 658 428 degrees of freedom
(DOF) in the calculation of u) for the 2D examples and 25 088 tetrahedra (122
334 DOF) for the 3D example. Figures 3 and 4 show the plasticity domains
in the single-yield and in the multi-yield case. The elastic zones are coloured
light grey, the first plastic zones middle grey, and the second plastic zones
dark grey.
Fig. 4. Plasticity domains in the single-yield (left) and two-yield case (right) of the
quarter of the ring

6 Conclusions and Future Work 

In this paper a multi-yield plasticity model and its numerical computation
were presented. A nonlinear iterative algorithm that uses a multigrid
preconditioned solver was described, and its performance in 2D and 3D was demonstrated.

In the future we will extend the solution idea to a quasi-Newton algorithm,
i.e., the Schur complement matrix will contain some Hessian-type entries in order
to improve the computational performance. This idea is already implemented
for the single-yield case, where the numerical results demonstrate faster
performance of the algorithm, with linear complexity. We expect the same result for
the multi-yield case.

Another long-term aim is to identify the interfaces between the elastic and
plastic zones and to refine the mesh adaptively in such a way that the interface
is approximated by the mesh. We then expect an even faster performance of
the algorithm.
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Summary. It has been over fifty years since David M. Young’s original work on the 
successive overrelaxation (SOR) methods. This fundamental method now appears 
in all textbooks containing an introductory discussion of iterative solution meth- 
ods. (Most often the SOR method appears after a presentation of Jacobi iteration 
and Gauss-Seidel iteration and before the conjugate gradient iterative method.) We 
present a brief survey of some of the research of Professor David M. Young, together 
with his students and collaborators, on iterative methods for solving large sparse lin- 
ear algebraic equations. This is not a complete survey but just a sampling of various 
papers with a focus on some of these publications. 

Dr. David M. Young’s doctoral thesis [27] was accepted in 1950 by his supervisor,
Professor Garrett Birkhoff of Harvard University, and his paper [28] based on this
work appeared in 1954. This is one of the landmark contributions in modern
numerical analysis. The red-black ordering for matrices is of great importance in parallel
computing. Gene Golub has said: “It’s almost as if David could see into the future!” 

David Young celebrated his 80th birthday on October 20, 2003
(http://www.ma.utexas.edu/CNA/photos.html).



1 Introduction 

We present a brief survey of some of the work of Professor David M. Young, 
together with his students and collaborators, on iterative methods. Dr. David 
M. Young has been involved in research on iterative methods for solving large 
sparse linear algebraic equations for over forty years until his recent retirement. 
This is not a complete survey but just a sampling of various projects with 
a focus on some of his publications. 



2 Successive Overrelaxation 

From research first done at Harvard University, Young presented in his Ph.D. 
thesis [27], and in a subsequent paper [28], an analysis of the successive overre- 
laxation (SOR) method for the case where the coefficient matrix of the linear 
algebraic system Au = b is consistently ordered [30]. An elliptic partial
differential equation over a region with grid points numbered in the natural
ordering (left-to-right and up), discretized with the standard five-point
stencil, results in such a matrix system. In fact, any matrix system derived in
this way has Young’s Property A [30]. Moreover, a consistently ordered system
can be obtained from one with Property A after a suitable permutation.

For a matrix with Property A, one can permute the rows and corresponding
columns to obtain a red-black system. The red-black ordering corresponds to a
red and black checkerboard ordering of the grid points. When A is a red-black
matrix, it is consistently ordered and Young’s equation [27]
$(\lambda + \omega - 1)^2 = \omega^2 \mu^2 \lambda$
gives a relation between the eigenvalues $\lambda$ of the iteration matrix
$\mathcal{L}_\omega$ for the SOR method and the eigenvalues $\mu$ of the iteration
matrix B for the Jacobi method. If A is symmetric positive definite, then the
eigenvalues of B are real and less than 1 in absolute value and the optimum or
best value of the acceleration factor $\omega$ is given by
$\omega_b = 2/\bigl(1 + \sqrt{1 - S(B)^2}\bigr)$. Here $S(B)$
is the spectral radius of the Jacobi matrix B, which is the magnitude of the
eigenvalue of largest absolute value of the matrix B. Moreover, the spectral
radius of the SOR matrix with the optimum relaxation parameter $\omega = \omega_b$ is
given by $S(\mathcal{L}_{\omega_b}) = \omega_b - 1 = r$.

For model problems involving the Poisson equation over a region with mesh
points of grid size $h$, it can be shown that the number of iterations required
for convergence of the SOR method is $n = O(h^{-1})$, whereas the number of
iterations is $n = O(h^{-2})$ using either the Jacobi or Gauss-Seidel methods. In
this situation, the SOR method is faster by an order of magnitude.
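
The optimum relaxation factor and the resulting gain can be checked numerically. The following is a minimal sketch, not code from the ITPACK project: the function names are hypothetical, and the iteration count is only the asymptotic estimate log(tol)/log(radius). For the five-point Poisson model problem on the unit square one may take $S(B) = \cos(\pi h)$.

```cpp
#include <cassert>
#include <cmath>

// Optimum SOR relaxation factor omega_b = 2 / (1 + sqrt(1 - S(B)^2)) for a
// consistently ordered system whose Jacobi matrix has real eigenvalues and
// spectral radius S(B) < 1.
double optimal_omega(double jacobi_radius)
{
    assert(jacobi_radius >= 0.0 && jacobi_radius < 1.0);
    return 2.0 / (1.0 + std::sqrt(1.0 - jacobi_radius * jacobi_radius));
}

// Spectral radius of the SOR matrix at the optimum factor: r = omega_b - 1.
double optimal_sor_radius(double jacobi_radius)
{
    return optimal_omega(jacobi_radius) - 1.0;
}

// Asymptotic estimate of the number of iterations needed to reduce the
// error by the factor tol when each step contracts the error by `radius`.
double iterations_needed(double radius, double tol)
{
    assert(radius > 0.0 && radius < 1.0 && tol > 0.0 && tol < 1.0);
    return std::log(tol) / std::log(radius);
}
```

Calling iterations_needed once with $S(B)$ and once with optimal_sor_radius($S(B)$) for decreasing $h$ shows the gap between the two counts widening as the grid is refined, in line with the $O(h^{-2})$ and $O(h^{-1})$ estimates.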

Work has been done on the choice of the optimum $\omega$ for the case where A is
consistently ordered but not symmetric positive definite and where some of the
eigenvalues of the Jacobi iteration matrix B are complex. Several
programs are available for choosing the optimum $\omega$ if all of the eigenvalues of
B are known or if one knows a convex region containing them. See Young and
Eidson [35] and Young and Huang [39].

An extension of the SOR method is the modified SOR (MSOR) method
for a linear system with a red-black coefficient matrix. The MSOR method
involves the use of relaxation factors $\omega_1, \omega'_1, \omega_2, \omega'_2, \ldots$,
where $\omega_i$ is used for the red components and $\omega'_i$ is used for the black
points, for each $i$. In Young, Wheeler, and Downing [34], it is shown that there
are suitable values of $\omega_i$ and $\omega'_i$ that are as good as, though not better
than, the choice $\omega_i = \omega'_i = \omega_b$ for all $i$. On the other hand, other
choices are more effective if one measures the
effectiveness in terms of certain norms as shown in Young and Kincaid [42].
Chapters 8 and 10 of Young [30] cover the modified SOR method with fixed
and variable parameters, respectively.

A number of other modifications and extensions have been made to the 
SOR theory. For instance in group or block methods, the unknowns are 
grouped into blocks and all values within a block are updated simultaneously. 
Usually, each inner iteration of a block method is done by a direct method 
since the matrices for the blocks are assumed to be easily solvable. For exam- 
ple, these matrices are tridiagonal in the case of the line SOR method when the 
five-point finite difference stencil is used. Also, faster convergence is obtained 
for line SOR methods than for point SOR methods, in general. Moreover, the
general SOR theory has been applied to group iterative methods in Chapter 
14 of Young [30]. 

Research on norms associated with the SOR method for the red-black
system has resulted in new formulas. It has been shown that the graph of the
$D^{1/2}$-norm of $\mathcal{L}_{\omega_b}^m$, as a function of $m$, is not monotonically
decreasing (it increases and then decreases), but the $A^{1/2}$-norm is indeed a
monotonically decreasing function of $m$; however, it is still considerably larger
than the spectral radius function $S(\mathcal{L}_{\omega_b}^m) = r^m$. See Young and
Kincaid [42] and Chapter 7 in Young [30].

Corresponding to the SOR method is the Symmetric Successive Overre- 
laxation (SSOR) method in which an iteration consists of one iteration of 
the (forward) SOR method followed by one iteration of the (backward) SOR 
method. In the Unsymmetric Successive Overrelaxation (USSOR) method, dif- 
ferent parameters may be used in the red and black equations, respectively. 
Young [29] presents convergence properties of the symmetric and unsymmetric 
successive overrelaxation methods and related methods. 



3 Chebyshev Acceleration 

The SOR method can be regarded as a way to accelerate the convergence of the 
Jacobi method in a certain sense. Another way of speeding up the convergence 
of the Jacobi method is to use an extrapolation method or a Chebyshev accel- 
eration method, which is based on Chebyshev polynomials. These are general 
procedures and they can be applied to methods other than just the Jacobi 
method as shown in Hageman and Young [7] . 

Suppose the basic iterative method to be used in the acceleration procedure
has the form $u^{(n+1)} = G u^{(n)} + k$, where the eigenvalues $\mu$ of $G$ are
bounded such that $m(G) \le \mu \le M(G)$. Here $m(G)$ and $M(G)$ are the smallest
and largest eigenvalues of $G$, respectively. Using the three-term relation for
Chebyshev polynomials, the optimal Chebyshev acceleration method can be written as
$u^{(n+1)} = \rho_{n+1}\{\gamma (G u^{(n)} + k) + (1 - \gamma) u^{(n)}\} + (1 - \rho_{n+1}) u^{(n-1)}$,
where $\rho_{n+1} = (1 - (\sigma/2)^2 \rho_n)^{-1}$ (with $\rho_1 = 1$ and
$\rho_2 = (1 - \sigma^2/2)^{-1}$), $\sigma = [M(G) - m(G)]/[2 - M(G) - m(G)]$, and
$\gamma = 2/[2 - M(G) - m(G)]$.

Varga [26] refers to this procedure as the Chebyshev semi-iterative method.
One needs to choose estimates for $M(G)$ and $m(G)$, which may cause
difficulties in some cases. In fact, the behavior of the acceleration procedure is
often sensitive to these estimates and especially the one for $M(G)$. It can be shown
that the optimum Chebyshev acceleration procedure is an order of magnitude
faster than the optimum extrapolated procedure for $\sigma$ close to one [7].
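
The three-term recurrence can be illustrated with a short program. This is a sketch under stated assumptions rather than the ITPACK implementation: the names chebyshev_accelerate and jacobi_poisson1d are hypothetical, the eigenvalue bounds are assumed to be known exactly, and the basic method is Jacobi iteration for the one-dimensional model matrix tridiag(-1, 2, -1).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Chebyshev acceleration of the basic method u <- G u + k.
// apply_basic(u) must return G*u + k; m_est and M_est are the estimates of
// the smallest and largest eigenvalues of G (M_est < 1 is required).
Vec chebyshev_accelerate(const std::function<Vec(const Vec&)>& apply_basic,
                         const Vec& u0, double m_est, double M_est, int n_iter)
{
    assert(m_est <= M_est && M_est < 1.0);
    const double gamma = 2.0 / (2.0 - M_est - m_est);
    const double sigma = (M_est - m_est) / (2.0 - M_est - m_est);
    Vec u_prev = u0, u_curr = u0;
    double rho = 1.0;
    for (int n = 0; n < n_iter; ++n) {
        // rho_1 = 1, rho_2 = 1/(1 - sigma^2/2), rho_{n+1} = 1/(1 - (sigma/2)^2 rho_n).
        if (n == 0)      rho = 1.0;
        else if (n == 1) rho = 1.0 / (1.0 - 0.5 * sigma * sigma);
        else             rho = 1.0 / (1.0 - 0.25 * sigma * sigma * rho);
        const Vec g = apply_basic(u_curr);
        Vec u_next(u_curr.size());
        for (std::size_t i = 0; i < u_curr.size(); ++i)
            u_next[i] = rho * (gamma * g[i] + (1.0 - gamma) * u_curr[i])
                      + (1.0 - rho) * u_prev[i];
        u_prev = u_curr;
        u_curr = u_next;
    }
    return u_curr;
}

// One Jacobi step G*u + k for tridiag(-1, 2, -1) u = b (illustrative only).
Vec jacobi_poisson1d(const Vec& u, const Vec& b)
{
    const std::size_t n = u.size();
    Vec out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double left  = (i > 0)     ? u[i - 1] : 0.0;
        const double right = (i + 1 < n) ? u[i + 1] : 0.0;
        out[i] = 0.5 * (left + right + b[i]);
    }
    return out;
}
```

For this model matrix with N unknowns the Jacobi eigenvalues lie in $[-\cos(\pi/(N+1)), \cos(\pi/(N+1))]$, so these two values can serve as m_est and M_est. In practice the bounds are not known exactly, which is the difficulty addressed by the adaptive procedure.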

If one applies the Chebyshev acceleration procedure to the Jacobi method 
as the basic method for solving a linear system with a red-black coefficient 
matrix, then the computation can be simplified. This is done by rewriting 
the procedure in terms of only the red points or only the black points for
the Jacobi method. Golub and Varga [6] refer to this as the cyclic Chebyshev 
semi-iterative method. The cyclic acceleration method is the original Cheby- 
shev acceleration method with half of the calculations bypassed. The result- 
ing method is equivalent to a special case of the modified SOR method with 

L0^—L0\ — Pn+1- 

With an adaptive Chebyshev acceleration procedure, one continuously revises
the iteration parameters as the iterative method proceeds. The algorithm
fixes the smallest eigenvalue estimate $m_E \le m(G)$ and adaptively modifies the
largest eigenvalue estimate $M_E$ while keeping $M_E < M(G)$. The iterative procedure
continues using these values of $m_E$ and $M_E$ until the observed convergence is
much slower than expected in a certain sense. By solving a Chebyshev equation,
the algorithm then increases $M_E$ but keeps $M_E < M(G)$. This adaptive Chebyshev
acceleration procedure is repeated until convergence is achieved according to
the stopping test being utilized. Chebyshev polynomials are used in the
algorithm for choosing these maximum and minimum eigenvalue estimates. Such
a procedure was developed by Hageman and Young [7] and it was incorporated
into the algorithms used in the ITPACK software packages [18].

It has been shown that, in some cases, one can obtain almost as good
a convergence as with Chebyshev acceleration by the use of a stationary
second-degree method given by
$u^{(n+1)} = \rho\{\gamma (G u^{(n)} + k) + (1 - \gamma) u^{(n)}\} + (1 - \rho) u^{(n-1)}$.
Here let $\rho = 1$ when $n = 0$. See some of the papers by Young and/or Kincaid
on second-degree methods [14, 15, 20, 31].



4 Conjugate Gradient Acceleration 

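Conjugate gradient acceleration
$u^{(n+1)} = \rho_{n+1}\{u^{(n)} + \gamma_{n+1} r^{(n)}\} + (1 - \rho_{n+1}) u^{(n-1)}$
is similar to Chebyshev acceleration except that the parameters
used involve inner products:

$\rho_{n+1} = \left[1 - \frac{\gamma_{n+1}}{\gamma_n} \frac{(r^{(n)}, r^{(n)})}{(r^{(n-1)}, r^{(n-1)})} \frac{1}{\rho_n}\right]^{-1}$

(with $\rho_1 = 1$) and $\gamma_{n+1} = (r^{(n)}, r^{(n)})/(r^{(n)}, A r^{(n)})$. Here $r^{(n)} = b - A u^{(n)}$ is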

the residual vector. As with the Chebyshev acceleration method, the conjugate
gradient acceleration method can speed up the Jacobi method and other methods.
Conjugate gradient acceleration has some advantages over Chebyshev
acceleration [7]. It can be shown that the convergence of conjugate gradient
acceleration, measured in a certain norm, is at least as fast as that of Chebyshev
acceleration. With conjugate gradient acceleration no parameter estimates are
required; however, the basic iterative method may involve a parameter, as in
the case when SSOR is used as the basic method. Since conjugate gradient
acceleration requires the computation of inner products in each iteration,
the work required per iteration may be somewhat greater than for Chebyshev
acceleration. For basic methods that are not symmetrizable, the generalized
conjugate gradient methods can be used to accelerate their convergence [7].
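
The three-term form of the conjugate gradient method, with parameters obtained from inner products of residuals, can be sketched as follows. This is an illustration under assumptions, not the ITPACK code: the names are hypothetical, the matrix is stored densely for brevity, and the parameter formulas are the standard three-term conjugate gradient ones, $\gamma_{n+1} = (r^{(n)}, r^{(n)})/(r^{(n)}, A r^{(n)})$ and $\rho_{n+1} = [1 - (\gamma_{n+1}/\gamma_n)((r^{(n)}, r^{(n)})/(r^{(n-1)}, r^{(n-1)}))/\rho_n]^{-1}$ with $\rho_1 = 1$.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense storage, for illustration only

double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

Vec matvec(const Mat& A, const Vec& x)
{
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

// Three-term conjugate gradient iteration for A u = b, A assumed symmetric
// positive definite. Stops when the residual norm drops below tol.
Vec cg_three_term(const Mat& A, const Vec& b, Vec u, int max_iter, double tol)
{
    assert(A.size() == b.size() && b.size() == u.size());
    Vec u_prev = u;
    double rho = 1.0, gamma_prev = 1.0, rr_prev = 1.0;
    for (int n = 0; n < max_iter; ++n) {
        const Vec Au = matvec(A, u);
        Vec r(b.size());
        for (std::size_t i = 0; i < b.size(); ++i) r[i] = b[i] - Au[i];
        const double rr = dot(r, r);
        if (std::sqrt(rr) <= tol) break;
        const double gamma = rr / dot(r, matvec(A, r));
        // rho_1 = 1; later values couple the current and previous residuals.
        rho = (n == 0) ? 1.0
                       : 1.0 / (1.0 - (gamma / gamma_prev) * (rr / rr_prev) / rho);
        Vec u_next(u.size());
        for (std::size_t i = 0; i < u.size(); ++i)
            u_next[i] = rho * (u[i] + gamma * r[i]) + (1.0 - rho) * u_prev[i];
        u_prev = u;
        u = u_next;
        gamma_prev = gamma;
        rr_prev = rr;
    }
    return u;
}
```

Unlike the Chebyshev sketch, no eigenvalue estimates enter; the price is two inner products and one extra matrix-vector product per step in this simple form.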
We can introduce a nonsingular matrix $W$ as follows: $[W(I - G)W^{-1}][Wu] = Wk$,
which in terms of the original linear system is $[WQ^{-1}AW^{-1}][Wu] = WQ^{-1}b$,
where $Q$ is the splitting matrix with $G = I - Q^{-1}A$ and $k = Q^{-1}b$.
The generalized conjugate gradient acceleration method [7] corresponding
to the basic iterative method is given by

$u^{(n+1)} = \rho_{n+1}\{\gamma_{n+1}\delta^{(n)} + u^{(n)}\} + (1 - \rho_{n+1}) u^{(n-1)}$,

where

$\rho_{n+1} = \left[1 - \frac{\gamma_{n+1}}{\gamma_n \rho_n} \frac{(W\delta^{(n)}, W\delta^{(n)})}{(W\delta^{(n-1)}, W\delta^{(n-1)})}\right]^{-1}$

(with $\rho_1 = 1$),

$\gamma_{n+1} = \left[1 - \frac{(W\delta^{(n)}, WG\delta^{(n)})}{(W\delta^{(n)}, W\delta^{(n)})}\right]^{-1}$,

and the pseudo-residual vector is $\delta^{(n)} = Gu^{(n)} + k - u^{(n)}$. The conjugate
gradient acceleration method minimizes the $[W^T W(I - G)]^{1/2}$-norm of
the error as compared with any polynomial acceleration procedure based on the
same basic method. If $A$ and $Q$ are symmetric positive definite matrices and if
$W^T W = Q$, then we minimize the $A^{1/2}$-norm of the error as in the conjugate
gradient method. It can be shown that the average rate of convergence for the
conjugate gradient method, when measured in the $[W^T W(I - G)]^{1/2}$-norm, is at
least as large as that for the corresponding Chebyshev acceleration procedure. (See
Hageman and Young [7].) As in Young [30], we denote the $L$-matrix norm by
$\|Q\|_L = \|LQL^{-1}\|_2$.



5 Nonsymmetric Systems 

A difficult problem is solving the linear system when the coefficient matrix 
A is not necessarily symmetric positive definite or even symmetric. Three 
generalized conjugate gradient acceleration methods called ORTHODIR, OR- 
THOMIN, and ORTHORES were considered by Young and Jea [40]. It was 
shown that under fairly general conditions these methods converge, in exact 
arithmetic, in at most N iterations, where N is the order of the matrix. Also 
in Jea and Young [9], the biconjugate gradient (BCG) method as well as other 
forms of Lanczos methods were considered as generalized conjugate gradient 
acceleration methods corresponding to certain double linear systems
involving $A$ and $A^T$.

The generalized minimum residual (GMRES) method [25] is a widely used 
method for solving nonsymmetric linear systems. The method is generally very 
reliable although stagnation may occur in some cases. Moreover, for nonsym- 
metric systems, the amount of work required per iteration usually increases 
as the number of iterations increases. Recently, Young, working with Chen
and Kincaid, developed various generalizations of the GMRES method and
combined them with the Lanczos procedure. New iterative methods called
GGMRES, MGMRES, and LAN/MGMRES have been established [2, 3, 4, 5].
The GGMRES method is a slight generalization of the GMRES method. The
GMRES method minimizes a norm of the residual and GGMRES minimizes
a more general norm; both involve a minimization condition. Alternatively,
the MGMRES method is a modification of the GMRES method applied to
a symmetric indefinite linear system using a Galerkin condition. This latter
method is related to the BCG method and to other variants of the Lanczos
method. The LAN/MGMRES method aims at combining the reliability of the
GMRES method with the reduced work of a Lanczos-type method. In
initial numerical experiments on nonsymmetric linear systems arising
from convection-diffusion problems, it was found that LAN/MGMRES was
comparable with a number of other methods that are extensively used.



6 Software for Iterative Methods 

Under the direction of Kincaid and Young, several research-oriented software 
packages were written as part of the ITPACK Project at the Center for Nu- 
merical Analysis. Beginning in the mid-1970s, there was an increased effort 
to develop iterative algorithms and portable public domain software. Software 
packages, such as the ITPACK 2C package, were developed, which included 
automatic procedures for handling choices that were causing difficulties for 
users of iterative methods. Automatic procedures were developed for
determining all necessary iteration parameters and for accurate and realistic 
stopping tests for iterative algorithms. Algorithms based on these procedures 
were described in the book by Hageman and Young [7]. Also, software from 
the ITPACK 2C package was modified and incorporated into the ELLPACK 
package at Purdue University for solving elliptic partial differential equations. 
(See Rice and Boisvert [24].) 

While the ITPACK 2C package was intended primarily for linear systems 
where the coefficient matrix is symmetric positive definite or nearly so, other 
packages such as NSPCG [23] and PCG [10] were developed with the capability 
of handling nonsymmetric systems. Other ITPACK Project software includes
ITPACKV 2D [17], ITPACK 3A [45], and ITPACK 3B [22], for example. See
Kincaid and Young [19] for a review of the ITPACK software packages.



7 Alternating-Type Iterative Methods

To construct an alternating-type method for solving $Au = b$, we choose matrices
$H$, $V$, and $\Sigma$ such that $A = H + V$, where $\Sigma$ is a diagonal matrix with
positive diagonal elements. For any linear system of the form $(H + \rho\Sigma)v = w$
or $(V + \rho\Sigma)v = w$, we assume that $H + \rho\Sigma$ and $V + \rho\Sigma$ are nonsingular
matrices for any positive real number $\rho$ and that for any vector $w$ one can easily
solve for $v$. To define an alternating-type iterative method, we choose positive
numbers $\rho$ and $\rho'$ and, for a given $u^{(n)}$, we determine $u^{(n+1/2)}$ and $u^{(n+1)}$ by

$(H + \rho\Sigma)u^{(n+1/2)} = b - (V - \rho\Sigma)u^{(n)}$ and $(V + \rho'\Sigma)u^{(n+1)} = b - (H - \rho'\Sigma)u^{(n+1/2)}$.

Thus, we have $u^{(n+1)} = T_{\rho,\rho'}u^{(n)} + k_{\rho,\rho'}$, where
$T_{\rho,\rho'} = (V + \rho'\Sigma)^{-1}(H - \rho'\Sigma)(H + \rho\Sigma)^{-1}(V - \rho\Sigma) = I - (\rho + \rho')(V + \rho'\Sigma)^{-1}\Sigma(H + \rho\Sigma)^{-1}(H + V)$
and $k_{\rho,\rho'} = (\rho + \rho')(V + \rho'\Sigma)^{-1}\Sigma(H + \rho\Sigma)^{-1}b = (I - T_{\rho,\rho'})A^{-1}b$. Examples of
alternating-type methods are the alternating-direction implicit (ADI) method,
the symmetric successive overrelaxation (SSOR) method, and the unsymmetric
successive overrelaxation (USSOR) method. With the ADI method, $H$ and
$V$ are either tridiagonal or are permutationally similar to tridiagonal matrices
and $\Sigma = I$. With the SSOR and USSOR methods, $H$ and $V$ are lower
triangular and upper triangular matrices, respectively, and $\Sigma$ is a diagonal matrix
with positive diagonal elements.

In certain cases, the ADI method converges rapidly. For example, with
problems involving Poisson’s equation in the rectangle with a grid of mesh
size $h$, the ADI method converges in $n = O(\log h^{-1})$ iterations using the
optimum number of parameters and in $n = O(h^{-1/m})$ iterations using the best
$m$ parameters. Recall that $n = O(h^{-1})$ for the SOR method. The commutative
case is when $HV = VH$, $H\Sigma = \Sigma H$, $V\Sigma = \Sigma V$. It holds for certain separable
self-adjoint elliptic partial differential equations defined over rectangles. Given
the commutativity condition and also bounds on the eigenvalues of $H$ and
$V$, necessary and sufficient convergence conditions related to choosing ADI
parameters can be found in Birkhoff, Varga, and Young [1] and in Chapter 17
of Young [30]. Also see Young and Wheeler [48].

With a nonstationary alternating-type iterative method, the parameters $\rho$
and $\rho'$ may vary from iteration to iteration. We seek to determine the
parameters $\{\rho_i\}$ and $\{\rho'_i\}$ so that $u^{(n)}$ is as close to the true solution
$u = A^{-1}b$ as possible. In practice, we seek to make the spectral radius
$S\bigl(\prod_{i=1}^{n} T_{\rho_i,\rho'_i}\bigr)$ as small
as possible. As an alternative to the (sequential) nonstationary method, we
consider the parallel alternating-type iterative method. See papers by Young
and Kincaid [43, 44].



8 Books 

The classical research monograph Iterative Solution of Large Linear Sys- 
tems [30] by Professor Young was published in 1971 and reprinted in 2003. 
Also, the book Applied Iterative Methods [7] by David M. Young in collabo- 
ration with Louis A. Hageman appeared in 1981 and was reprinted in 2004. 
Also, we should mention A Survey of Numerical Mathematics, in two volumes 
(1972 and 1973), written by David M. Young and Robert T. Gregory [36, 37]. 
Summary. A simplified relational database model implementation for parallel
numerical computation is presented. The efficiency of this data model is demonstrated
through the example of an optimally parallelized H-matrix (hierarchical-matrix)
multiplier and its application as a 3-dimensional Laplace preconditioning matrix.



1 Introduction 

Numerical algorithms have become quite complex and require dynamic
data structures. The general practice is to use a mixed data structure for this
purpose [5]. In this paper a simplified version of the well-known relational
database model is presented, which is developed for parallel numerical
computation. The relational database model was introduced by E. F. Codd [3, 6]. The great
advantage of this model is its flexible, simple and homogeneous data structure.
Moreover, the application of special geometric search trees (oct-tree, ADT,
etc.) [4, 6, 13] can be avoided in this way. The simplified database model is
implemented in an object-oriented manner using the programming language
C++.

The non-parallel version of this simplified relational database model for
non-structural mesh generation is investigated in [11]. This paper
demonstrates the efficiency of the parallel version in the case of an optimally
parallelized H-matrix (hierarchical-matrix) multiplier. The applied H-matrix
technique originates from W. Hackbusch and B. N. Khoromskij [8, 9, 2]. A great
advantage of the presented parallel H-matrix multiplier is that its
implementation is independent of the geometrical complexity of the domains.

Our paper is organized as follows. Section 2 contains the description of
the simplified relational database model. The parallelized H-matrix multiplier
is discussed in Section 3. Section 4 is devoted to its application as a Laplace
preconditioning matrix on a very complex 3-dimensional domain.

* This work was partly supported by the Grants T043258 and T042826 of the
Hungarian National Research Fund.
2 Simplified Relational Database Model



In the case of the classical relational database model the data files are
considered as tables, where the number of columns is fixed and the number of
rows may vary. Each column contains elementary data of the same data
type and can be referred to by a symbolic name. These symbolic names are
called field identifiers. Each row (record) is identified by its row (record)
number and a part of a row (record) by its field identifier.

Numerical problems usually require only one ordering on one data
table. Hence it is worthwhile to use combined data and index tables here to store the
records in the adequate key order. This idea is implemented as a RelDataBase
class with the following member functions.



Table 1. Member Functions of the Class RelDataBase

Member Function                                         Description
Create(int TableId, int RecSize, KeyDesc* pKeyDesc);    Creates a data table.
Erase(int TableId);                                     Erases a data table.
Insert(int TableId, char* pRecord);                     Inserts a record.
Update(int TableId, char* pRecord);                     Updates a record.
Delete(int TableId, char* pRecord);                     Deletes a record.
DeleteAll(int TableId, char* pRecord);                  Deletes all records.
GetFirst(int TableId, char* pRecord);                   Gets the first record.
GetNext(int TableId, char* pRecord);                    Gets the next record.
GetLast(int TableId, char* pRecord);                    Gets the last record.
GetPrior(int TableId, char* pRecord);                   Gets the previous record.
GetCurrent(int TableId, char* pRecord);                 Gets the current record.
SeekLessEq(int TableId, char* pRecord);                 Seeks for less or equal.
SeekGreatEq(int TableId, char* pRecord);                Seeks for greater or equal.



For the sake of flexibility the records are passed by reference, and the record
pointers are cast to character pointer type. The third parameter pKeyDesc of the
member function Create points to a key descriptor structure for key segment
identification (data type, beginning position).



struct KeyDesc
{
    char KeyFieldType[KEY_SEGMENT_MAX];    // Data type of each key segment.
    char KeyFieldBegPos[KEY_SEGMENT_MAX];  // Beginning position of each key segment.
};



The integrated data and index tables are realized by the well-known
red-black and B-tree balanced tree structures [4, 6]. Both tree structures are
asymptotically optimal and widely used. Each of the tree operations Insert,
Delete, Update and Seek requires at most O(log(n)) arithmetic operations,
where n denotes the number of tree nodes. However, the cost
of a sequential read in index order is only O(1) per record in practice.
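
The behaviour of such a combined data and index table can be imitated with the standard container std::map, which is typically implemented as a red-black tree. The sketch below is only an illustration: OrderedTable and its integer key are hypothetical and much simpler than the RelDataBase class, whose records are raw byte buffers with segmented keys.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

// Hypothetical stand-in for one combined data/index table: the records are
// stored in key order inside a balanced search tree.
class OrderedTable {
public:
    void Insert(int key, const std::string& record) { rows_[key] = record; }
    bool Delete(int key) { return rows_.erase(key) > 0; }

    // Returns the record stored under an existing key.
    const std::string& Get(int key) const
    {
        auto it = rows_.find(key);
        assert(it != rows_.end());
        return it->second;
    }

    // Finds the smallest stored key >= key; returns false if there is none.
    bool SeekGreatEq(int key, int& found) const
    {
        auto it = rows_.lower_bound(key);
        if (it == rows_.end()) return false;
        found = it->first;
        return true;
    }

    // Finds the largest stored key <= key; returns false if there is none.
    bool SeekLessEq(int key, int& found) const
    {
        auto it = rows_.upper_bound(key);
        if (it == rows_.begin()) return false;
        found = std::prev(it)->first;
        return true;
    }

private:
    std::map<int, std::string> rows_;
};
```

Insert, Delete and both seek operations take O(log(n)) steps here, while stepping an iterator through the map visits the records in key order, which corresponds to the sequential read mentioned above.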

We have applied the distributed database technique, using a local simplified
relational database on each process together with the message-passing
standard MPI [7].



3 Parallel H-Matrix Multiplier

Let $\Omega \subset \mathbb{R}^3$ be a given domain and let it be endowed with a regular,
quasi-uniform tetrahedral mesh. Furthermore, let $Q$ be a suitably
chosen rectangular box which contains $\Omega$. The coordinates of the minimal and
maximal nodes of $Q$ are denoted by dRectMin and dRectMax, respectively.
We consider the H-matrix partitioning of this fictitious domain $Q$. Two main
levels of the partitioning are introduced. The first one, nLevelSubDom, is
the level of subdomains. This level determines the size of the greatest H-matrix
blocks. The second one, nLevelMesh, is the level of the quasi-uniform mesh
on $\Omega$.



3.1 H-Matrix Partitioning

We have applied the following H-matrix partitioning algorithm.



//
#include <cmath>  // sqrt

// HMtxBlockWrite and dBlockCentrDistance are defined elsewhere.
void HSubMtxGen(int nLevelSubDom, int nLevelMesh,
                double dRectMin[3], double dRectMax[3],
                int nLevel, int nRowBlock[3], int nColBlock[3]);

void HMtxGen(int nLevelSubDom, int nLevelMesh,
             double dRectMin[3], double dRectMax[3])
{
    int nLV, nLevel, nRowBlock[3], nColBlock[3];
    nLevel = 0;
    for (nLV = 0; nLV < 3; nLV++) { nRowBlock[nLV] = 0; nColBlock[nLV] = 0; }
    HSubMtxGen(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
               nLevel, nRowBlock, nColBlock);
}
//
void HSubMtxGen(int nLevelSubDom, int nLevelMesh,
                double dRectMin[3], double dRectMax[3],
                int nLevel, int nRowBlock[3], int nColBlock[3])
{
    int nLVi, nLVj, nLVk, nNextLevel, nRowSubBlocks[8][3], nColSubBlocks[8][3];

    if (nLevel == nLevelMesh)
        HMtxBlockWrite(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
                       nLevel, nRowBlock, nColBlock);
    else
    {
        nNextLevel = nLevel + 1;
        // Coordinates of the eight sub-blocks of the row and column blocks.
        for (nLVi = 0; nLVi < 2; nLVi++)
            for (nLVj = 0; nLVj < 2; nLVj++)
                for (nLVk = 0; nLVk < 2; nLVk++)
                {
                    nRowSubBlocks[4*nLVi+2*nLVj+nLVk][0] = 2*nRowBlock[0]+nLVi;
                    nRowSubBlocks[4*nLVi+2*nLVj+nLVk][1] = 2*nRowBlock[1]+nLVj;
                    nRowSubBlocks[4*nLVi+2*nLVj+nLVk][2] = 2*nRowBlock[2]+nLVk;

                    nColSubBlocks[4*nLVi+2*nLVj+nLVk][0] = 2*nColBlock[0]+nLVi;
                    nColSubBlocks[4*nLVi+2*nLVj+nLVk][1] = 2*nColBlock[1]+nLVj;
                    nColSubBlocks[4*nLVi+2*nLVj+nLVk][2] = 2*nColBlock[2]+nLVk;
                }
        for (nLVi = 0; nLVi < 8; nLVi++)
        {
            for (nLVj = 0; nLVj < 8; nLVj++)
            {
                // Subdivide further above the subdomain level or when the two
                // sub-blocks are close to each other; otherwise write the block.
                if ((nLevel < nLevelSubDom) ||
                    (dBlockCentrDistance(nRowSubBlocks[nLVi],
                                         nColSubBlocks[nLVj]) <= (sqrt(3.01)/2.0)))
                    HSubMtxGen(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
                               nNextLevel, nRowSubBlocks[nLVi], nColSubBlocks[nLVj]);
                else
                    HMtxBlockWrite(nLevelSubDom, nLevelMesh, dRectMin, dRectMax,
                                   nNextLevel, nRowSubBlocks[nLVi], nColSubBlocks[nLVj]);
            }
        }
    }
}



3.2 H-Matrix-Vector Multiplication

Two tables are introduced for the matrix-vector computation by H-submatrices.
The table THMtx contains the descriptions of the H-submatrix blocks.
If the row and column blocks of a submatrix block are not too close to each
other, then this table is used for the submatrix-vector product computation
as well. In this implicit case the appropriate variables are determined from the
row and column blocks by using the table TSimp of tetrahedra. A tetrahedron
belongs to an H-submatrix row or column block if its center of gravity belongs
to the given block.

If a block is small and the row and column blocks are close to each other,
then the submatrix block is given explicitly. In this case the submatrix-vector
product is computed by the application of the table TExplHMtxBlocks.

The record structures of the tables THMtx, TExplHMtxBlocks and TSimp
are as follows.

//
// Table   : THMtx
// Content : Geometric search table for H-submatrix blocks.
// Key     : nCSubDomId+nRSubDomId+nSubMtxType+nSubMtxId
//           +nLevel+nColBlockCoords+nRowBlockCoords
struct RHMtx
{
    int nRowBlockSubDomId;   // Subdomain identifiers of the row
    int nColBlockSubDomId;   // and column blocks.
    int nNeighbType;         // Neighb. type of the subdomains (NBT_IDENT, NBT_NEAR, NBT_FAR).
    int nSubMtxType;         // Submatrix type (BLOCK_EXPL, BLOCK_IMPL).
    int nExplHMtxBlockId;    // Submatrix block id. in the explicit case.
    int nLevel;              // Division level.
    int nRowBlockCoords[3];  // The block coordinates of the
    int nColBlockCoords[3];  // row and column blocks.
};
//
// Table   : TExplHMtxBlocks
// Content : Explicitly given H-submatrix blocks.
// Key     : nExplHMtxBlockId+nRVarId+nCVarId
struct RExplHSubMtx
{
    int nExplHMtxBlockId;    // Explicit submatrix block identifier.
    int nRVarId;             // Row variable identifier.
    int nCVarId;             // Column variable identifier.
    double dValue;           // Value.
};
//
// Table   : TSimp
// Content : Geometric search table for tetrahedra.
// Key     : nSubDomId+nLevel+nBlockCoords+nSimpId
struct RSimp
{
    int nSimpId;             // Tetrahedron identifier.
    int nSubDomId;           // Subdomain identifier.
    int nLevel;              // Fine mesh level.
    int nBlockCoords[3];     // Search block coordinates.
    int nLocation;           // Location (LOC_INSIDE, LOC_BOUNDARY).
    int nNodeIds[4];         // Node identifiers.
};
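
The composite key of TSimp makes the table usable as a geometric search structure: all tetrahedra of one search block are adjacent in key order. The following sketch illustrates this with standard containers. It is an assumption-laden illustration, not the RelDataBase interface: SimpKey, SimpTable and SimplicesInBlock are hypothetical names, the table is held in a std::map, and tetrahedron identifiers are assumed to be non-negative.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <tuple>
#include <vector>

// Composite key in the order given for table TSimp:
// nSubDomId + nLevel + nBlockCoords + nSimpId (compared lexicographically).
using SimpKey = std::tuple<int, int, std::array<int, 3>, int>;

// Hypothetical in-memory table: key -> node identifiers of the tetrahedron.
using SimpTable = std::map<SimpKey, std::array<int, 4>>;

// Returns the identifiers of all tetrahedra stored for one search block.
std::vector<int> SimplicesInBlock(const SimpTable& table, int nSubDomId,
                                  int nLevel,
                                  const std::array<int, 3>& nBlockCoords)
{
    assert(nLevel >= 0);
    std::vector<int> ids;
    // Seek the first key with the requested prefix (the analogue of a
    // greater-or-equal seek), then read sequentially while the prefix matches.
    auto it = table.lower_bound(SimpKey(nSubDomId, nLevel, nBlockCoords, 0));
    for (; it != table.end(); ++it) {
        const SimpKey& key = it->first;
        if (std::get<0>(key) != nSubDomId || std::get<1>(key) != nLevel ||
            std::get<2>(key) != nBlockCoords)
            break;
        ids.push_back(std::get<3>(key));
    }
    return ids;
}
```

One seek followed by a sequential read replaces a traversal of a dedicated geometric search tree, which is the simplification the data model aims at.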

The global matrix-vector product $y = H \cdot x$ is computed as a sum of
local matrix-vector products of the form $y_{L(i),R(j)} = H_{L(i),R(j),C(k)} \cdot x_{L(i),C(k)}$.
Here $x_{L(i),C(k)}$ stands for the subvector of $x$ determined by the cluster with
parameters level $L(i)$ and column block coordinate $C(k)$. The vector $y_{L(i),R(j)}$
is defined similarly. Only uniform H-matrices will be considered. In this case
the order of the approximation is the same on each cluster determined by
H-submatrix blocks. Then the implicitly given H-submatrix blocks can be
expressed as i^L(i),i?(y),c(/e) = Em=i matrix-vector prod- 

ucts ’^L{i),c{k) are computed as 

M M M / M \ 

y-L(i)Mi) = ■ [X] ■ ^L{i),C{k) I > 

1=1 1=1 1=1 \m=l ) 

where the constant M depends only on the order of the used approximation. 

The distribution of the vectors among the processes is determined by the 
subdomain level iJ-matrix partitioning. The values q = Sm=i '^L{i),c{k) 
are computed on that subdomain which is identified by the H-submatrix
column block. Three types of local
matrix-vector product computations by H-submatrix blocks are introduced.
The field nNeighbType of the table THMtx serves this purpose. The
possible values of this field are listed in Table 2.

The compact data exchange denotes the exchange of single and grouped 
data together with the values ci . In the grouped case we send only the param- 
eters level L{i) and block coordinate R{j) of the row block cluster. The costs 
of the local and global variants of the compact data exchange are 0(log(^)) 
and 0(log(p) -p), respectively. Here n is the number of unknowns and p is the 
number of processes. 

The computation of the global $y = H \cdot x$ product needs an additional step
to correct the local values at the boundaries of neighbouring subdomains.
For this purpose a simple local data exchange is applied, again by using the
MPI functions MPI_Send and MPI_Recv. The cost of this correction
step is 0((p3), 
Table 2. Possible Values of the Field nNeighbType

Value        Description
NBT_IDENT    The subvectors $x_{L(i),C(k)}$ and $y_{L(i),R(j)}$ belong to the same
             subdomain.
NBT_NEAR     The subvectors $x_{L(i),C(k)}$ and $y_{L(i),R(j)}$ belong to neighbouring
             subdomains in the geometric sense. Here compact local data exchange is
             applied by using the table THCommNear and the MPI functions
             MPI_Send and MPI_Recv.
NBT_FAR      The subvectors $x_{L(i),C(k)}$ and $y_{L(i),R(j)}$ belong to
             non-neighbouring subdomains. Here compact global data exchange is
             applied by using the array ArrHCommFar and the MPI function
             MPI_Allreduce.



Since the computational cost of the values q and ci -ai is 0(log(^) • 

on each subdomain, the total computational cost of the global matrix-vector 
product y = H ' X is O ^log p ^ ’ p + per subdomain. 

The record structures of the table THCommNear and the array
ArrHCommFar are as follows.



//
// Table   : THCommNear
// Content : Data exchange among neighbouring subdomains.
// Key     : nSubDomId+nDataType+nVarId+nLevel+nBlockCoords
struct RHCommNear
{
    int    nSubDomId;        // Subdomain identifier.
    int    nDataType;        // Data type (DT_SINGLE, DT_GROUPED).
    int    nVarId;           // Variable id. (single case).
    int    nLevel;           // Division level (grouped case).
    int    nBlockCoords[3];  // Block coordinates (grouped case).
    double dValues[M];       // Values.
};
//
// Array   : ArrHCommFar
// Content : The array for data exchange among non-neighbouring
//           subdomains.
pArrCommFarArr = new double[nProcNum*M];
                             // The number of processes multiplied by M.



3.3 H-Matrix Multiplier Object

Our H-matrix multiplier object is implemented in the following C++ class
structure.

//
class HMTX {
public:
    HMTX();
    ~HMTX();
    int SetIOPuff(char* pIOPuff, int nIOPuffSize);
    int SetCommParam(int nProcId, int nProcNum);
    int SetDataBase(RelDataBase *pRDBase);    // Rel. database pointer.
    int SetGlobalTables(int tGSubDom,          // Subdomain descriptions.
                        int tGSubDomVar,       // Subdomain-variable
                        int tGVarSubDom,       // connections.
                        int tGNode,            // Nodes.
                        int tGSimp,            // Simplices.
                        int tGHMtx,            // H-matrices.
                        int tGExplHMtxBlocks,  // Explicit H-matrix blocks.
                        int tGVectSo,          // Source vector.
                        int tGVectTa);         // Target vector.
    int SetLocalTables(int tLSubDom,           // Subdomain descriptions.
                       int tLSubDomVar,        // Subdomain-variable connections.
                       int tLHCommNear,        // Near-communication.
                       int tLNode,             // Nodes.
                       int tLSimp,             // Simplices.
                       int tLHMtx,             // H-matrices.
                       int tLExplHMtxBlocks,   // Explicit H-matrix blocks.
                       int tLVectS,            // Auxiliary vector.
                       int tLVectSo,           // Source vector.
                       int tLVectTa);          // Target vector.
    int ScatSubDomInfo(void);                  // Scatters tGSubDom.
    int ScatSubDomVar(void);                   // Scatters tGSubDomVar and
                                               // tGVarSubDom.
    int ScatNode(void);                        // Scatters tGNode.
    int ScatSimp(void);                        // Scatters tSimp.
    int ScatHMtx(void);                        // Scatters tHMtx.
    int ScatExplHMtxBlocks(void);              // Scatters tExplHMtxBlocks.
    int ScatVect(int tGVect, int tLVect);      // Scatters tGVect.
    int GathVect(int tGVect, int tLVect);      // Gathers tLVect.
    int CompMtxMult(int rLVectTa, int rLVectSo);

private:
    // The member functions of the compact local data exchange.
    int CorrVectBlocks(int nProcId1, int nProcId2);
    int SendVectBlock(int nProcId1, int nProcId2);
    int RecvAndAddVectBlock(int nProcId1, int nProcId2);

    // The member functions of the simple local data exchange.
    int CorrVects(int nProcId1, int nProcId2);
    int SendVect(int nProcId1, int nProcId2);
    int RecvAndAddVect(int nProcId1, int nProcId2);
};



4 Numerical Results 

Our parallel H-matrix multiplier implementation has been tested as a Laplace
preconditioning matrix. The enclosed test results refer to quasi-uniform
regular tetrahedral triangulations of the very complex water domain $\Omega$
of the RABA Euro 2 diesel engine. The shape of this domain can be seen in
Figure 1.

We have tested the parallel solution of the three-dimensional Poisson
equation

$-\Delta u = f$ in $\Omega$, $u = 0$ on $\partial\Omega$

by using the preconditioned conjugate gradient (PCG) [5, 10, 12] algorithm.
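
For orientation, the textbook PCG iteration can be sketched as follows. This is a generic version written against abstract operators, not the parallel implementation tested here; the names pcg, applyA and applyP and the relative residual stopping rule are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Op = std::function<Vec(const Vec&)>;  // matrix-vector product

double dot(const Vec& a, const Vec& b)
{
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Solves A x = b for a symmetric positive definite A. applyA multiplies by A,
// applyP applies the preconditioner (D or P in the notation of this section).
// Stops when |r| <= eps * |b| (assumed criterion); returns the iteration count.
int pcg(const Op& applyA, const Op& applyP, const Vec& b, Vec& x,
        double eps, int maxIter)
{
    x.assign(b.size(), 0.0);
    Vec r = b;              // residual for the zero initial guess
    Vec z = applyP(r);      // preconditioned residual
    Vec p = z;              // search direction
    double rz = dot(r, z);
    const double bnorm = std::sqrt(dot(b, b));
    for (int it = 0; it < maxIter; ++it) {
        if (std::sqrt(dot(r, r)) <= eps * bnorm) return it;
        const Vec Ap = applyA(p);
        const double alpha = rz / dot(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        z = applyP(r);
        const double rzNew = dot(r, z);
        const double beta = rzNew / rz;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = z[i] + beta * p[i];
        rz = rzNew;
    }
    return maxIter;
}
```

Each iteration needs one product with the system matrix and one application of the preconditioner, so the cost of applying the preconditioner is what a parallel matrix multiplier has to keep small.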

Fig. 1. The Test Domain

The Poisson equation has been discretized by the finite element method,
applying piecewise linear finite elements. The optimal preconditioning matrix
P was built from the matrix V, the H-matrix approximation of the bilinear
form

$V(u,v) = \int_\Omega \int_\Omega \frac{u(x)\,v(y)}{|x-y|^{2}}\,dx\,dy$,

which is a norm representation. It is discretized by piecewise linear finite
elements on the level of the given tetrahedral mesh, and by piecewise
bilinear elements on the clusters, which are determined by the H-submatrix
blocks.



Table 3. Test results

Nodes    Tetrahedra  Processes  Computers  Prec. matrix  Iterations  CPU time in sec.

20455    100986       1          1         D             17           92
                      1          1         P              5           87
                      8          8         D             17           54
                      8          8         P              5           18
                     64         16         D             17           23
                     64         16         P              5            6
121441   403944       1          1         D             26          246
                      1          1         P              7          224
                      8          8         D             26          117
                      8          8         P              7           59
                     64         16         D             26           64
                     64         16         P              7           12
Our computer cluster consisted of 16 IBM PC compatible machines with 900 MHz
Intel processors, using the message-passing standard MPI under the Linux
operating system.

The computational results given in Table 3 correspond to the case when 
the right hand side of the Poisson equation is 

$f(x,y,z) = \sin(xyz)$

and the error bound is ^ = 10“"^. The preconditioning matrix D denotes the 
inverse of the main diagonal of the matrix of the discretized Laplace operator. 

From our results we can conclude that the matrix P is an efficient
preconditioning matrix, but its computation is fast enough only in a parallel
environment.
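
The figures behind this conclusion can be recomputed directly from the CPU times in Table 3. The following sketch only performs that arithmetic; the struct Timing and the function names are hypothetical and introduced for the illustration.

```cpp
#include <cassert>

// CPU times in seconds from one mesh and one process count of Table 3.
struct Timing {
    double cpuD;  // PCG with the diagonal preconditioner D
    double cpuP;  // PCG with the H-matrix preconditioner P
};

// Speedup of a parallel run relative to the one-process run.
double speedup(double cpuSerial, double cpuParallel)
{
    assert(cpuSerial > 0.0 && cpuParallel > 0.0);
    return cpuSerial / cpuParallel;
}

// How many times faster P is than D for the same mesh and process count.
double gainOverDiagonal(const Timing& t)
{
    assert(t.cpuD > 0.0 && t.cpuP > 0.0);
    return t.cpuD / t.cpuP;
}
```

For the smaller mesh, gainOverDiagonal gives 92/87, about 1.06, on one process and 23/6, about 3.8, on 64 processes, while speedup(87, 6) is 14.5 for P.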

Remark 1. The H-matrix approximation of the inverse Laplace matrix [1]
would probably give better convergence results; however, we found that its
implementation is much more complicated.
Summary. We present an equation capable of describing the evolution of implicit
surfaces (boundaries of gas bubbles) in liquids. The equation does not contain any
convective term and thus it is independent of the velocity and pressure fields given
by, e.g., the Navier-Stokes equations. The equation itself is obtained by a series of
variational problems. First, we interpret the Euler implicit time-stepping scheme for
the update of Lagrangian flow maps as a Monge-Kantorovich transference problem.
This approach then provides a variational principle for the updates of an Eulerian
phase Indicatrix in terms of the Wasserstein metric. The Euler-Lagrange equations
corresponding to the latter variational principle provide the sought-after evolution
equation for the implicit surfaces.



1 Classical Approach to Tracking of the Implicit Surfaces 

Let us consider a mixture of liquid and gas. We assume that the gas is formed in
bubbles, and we further require that both phases are immiscible. Let us denote
by $X = X(x,t)$ the Lagrangian flow map. Let the Eulerian Indicatrix function
$\chi \in BV(\Omega,\{0,1\})$ represent the characteristic function of the
domain $\Omega_G$ occupied by the gas. We set $\chi(x,t) = 1$ for
$x \in \Omega_G$ and $\chi(x,t) = 0$ otherwise. The connection between the
Lagrangian and Eulerian formalism is provided by the relation

$\partial_t X(x,t) = v(X(x,t),t)$, for all $x \in \Omega$, $t > 0$, (1)

where $v(x,t)$ is the velocity field in the Eulerian representation. This
equation can be solved for a smooth $v$ by the Cauchy-Lipschitz theorem. The
required immiscibility has the form

$\chi(X(x,t),t) = \chi(x,0) = \chi^0(x)$, for all $x \in \Omega$, $t > 0$. (2)

The transport equation for the Indicatrix follows immediately from the
immiscibility requirement (2) and from the equation (1). Namely,

$0 = \frac{d}{dt}\chi(X(x,t),t) = \partial_t\chi(X(x,t),t) + v(X(x,t),t)\cdot\nabla\chi(X(x,t),t)$, $x \in \Omega$, $t > 0$. (3)

The symbol $\nabla$ denotes gradients computed with respect to the
configuration at $t = 0$.

The free surfaces are characterized by the set $S(\nabla\chi(\cdot,t))$,
denoting the points of discontinuities in the gradient of $\chi$. The
transport equation (3) can thus be written formally (note that $\nabla\chi$
lives in the dual to $L^\infty$!) as

$\partial_t\chi(x,t) + v(x,t)\cdot\nabla\chi(x,t) = 0$ on the free surface $S(\nabla\chi(\cdot,t))$. (4)

The explicit dependence on the unknown, implicit, surface
$S(\nabla\chi(\cdot,t))$ can be circumvented by the introduction of the
so-called level set function $G$, [11]. Namely, let $G(x,t)$ be a smooth
function measuring the distance to the interface, being positive inside of a
gas domain, and let $\theta$ denote the Heaviside function. Then
$\chi = \theta \circ G$. Substituting this representation of the Indicatrix
into the transport equation (4) yields

$\delta_0(G)\,(\partial_t G + v\cdot\nabla G) = 0$. (5)

Assuming that the velocity field is known for any $x \in \Omega$, we obtain
from (5) the following level set formulation, [11],

$\partial_t G(x,t) + v(x,t)\cdot\nabla G(x,t) = 0$, $x \in \Omega$, $t > 0$. (6)
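
To make the classical formulation (6) concrete, a one-dimensional version with a constant velocity can be advanced by a first-order upwind step. This is only a sketch of the standard level set transport that the paper contrasts its approach with; the periodic grid, the constant velocity and the function name upwindStep are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One explicit first-order upwind step for G_t + v G_x = 0 on a periodic,
// uniform 1D grid with constant velocity v (assumed setting).
std::vector<double> upwindStep(const std::vector<double>& G,
                               double v, double dx, double dt)
{
    const std::size_t n = G.size();
    assert(n > 1 && dx > 0.0 && dt > 0.0);
    assert(std::fabs(v) * dt <= dx);  // CFL condition for stability
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t im = (i + n - 1) % n;  // left neighbour
        const std::size_t ip = (i + 1) % n;      // right neighbour
        // Take the difference from the side the information comes from.
        const double dG = (v >= 0.0) ? (G[i] - G[im]) : (G[ip] - G[i]);
        out[i] = G[i] - v * dt / dx * dG;
    }
    return out;
}
```

The step requires the velocity v at every grid point, which is exactly the dependence on the flow field that the variational formulation developed below avoids.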

Our goal is to derive a variational principle for $\chi$. In other words, we
strive to derive a Helmholtz free energy depending on $\chi$. A similar effort
within the field models is reported in [?]. The self-contained, i.e.,
independent of the subsequent velocity and pressure fields, evolution equation
for the Indicatrix is then obtained as the Euler-Lagrange equation of the
Lagrangian of the system. The key difference between the approach leading to
(6) and our theory is that we look at the evolution of the implicit (free)
surfaces as the Monge-Ampère transport problem.



2 Generalized Least Action Principle 

The Lagrangian of our system includes the kinetic, potential, and surface
energies. We assume that the density of the liquid and the density of the gas
are constant. Thus

$\rho_0(x) = \rho(x,0) = \rho(X(x,t),t)$. (7)

We assume that the density $\rho$ is given by

$\rho(X(x,t),t) = \rho_G\,\chi(X(x,t),t) + \rho_L\,(1-\chi(X(x,t),t))$, $x \in \Omega$, $t > 0$, (8)

where $\rho_G$, $\rho_L$ are the densities of the gas and liquid, respectively.
We consider a gas-liquid system with total energy of a Mumford-Shah type, [9],
given by

$\frac{1}{2}\int_\Omega \rho_0(x)\,|\partial_t X(x,t)|^2\,dx + E(X,t)$, (9)

where

$E(X,t) = \int_\Omega \rho_0(x)\,g\cdot X(x,t)\,dx + \alpha\,\|\nabla\chi(\cdot,t)\|(\Omega)$, and

$\chi(X(x,t),t) = \chi(x,0)$.

The surface energy, $\alpha\,\|\nabla\chi(\cdot,t)\|(\Omega)$, i.e., the total
variation of $\nabla\chi$, represents the perimeters of the subdomains
occupied by the gas multiplied by a surface tension coefficient $\alpha$,
which is a positive measured quantity. The vector $g$ represents the
gravitation force field.

We apply a generalization of the Principle of Stationary Action, [8], to
the Lagrangian (9). The equilibrium dynamics obtained from this principle
are given by the Euler equations with an added dissipative term. Moreover,
the pressure drop across the gas-liquid interfaces satisfies the Laplace-Young
equation. To account for the dissipative effects, we define the action as

A{X,tx,t 2 ) t 2 >ti> to, (10) 

Jti 

where $\lambda > 0$ represents Rayleigh's friction dissipation coefficient,
and $t_0 > 0$ is a given initial time. Then



X^A{X,h,t2) ( 11 ) 

represents a functional on the configuration space M^, given by 

>i?lXis one-to-one and onto, volume preserving, 

separately C^-maps in both the gas and the liquid}. 

( 12 ) 

We require the action to be stable on the flow maps X(.,t) G M with re- 
spect to variations which are compatible with the incompressibility and the 
immiscibility constraints. Therefore we consider a family of maps X^- preserv- 
ing the Lebesgue measure such that Xo = X. We set 




^ The configuration space is not a linear function space. Consequently, the differ- 
entiation has to be considered in the tangent space to M where summation is 
allowed. 
Hence, $Y \in T_{X(\cdot,t)}M$. The tangent space to $M$ at the point
$X(\cdot,t)$ is given by

$T_{X(\cdot,t)}M = \{Z \circ X(\cdot,t) \mid Z \in T_{id}M,\ X(\cdot,t) \in M\}$, (13)

where $X(\cdot,t) : x \mapsto X(x,t)$, and

TidM — {Z : i? I div Z(x) = 0 for x G i^L, 

Z • n = 0 on (9i7, (14) 

|Z • nj = 0 on DQl H OOg}- 

The jump in $Z \cdot n$ is the usual difference of the values from the
different sides of the gas-liquid interfaces. The structure of the linear space
$T_{id}M$, i.e., the divergence-free condition, is, by the Liouville theorem,
equivalent to the preservation of the Lebesgue measure. Moreover, the
variations must satisfy

$Y(\cdot,t_1) = Y(\cdot,t_2) = 0$. (15)

We implement the Principle of Stationary Action by requiring the Gateaux
derivative of the action to vanish on variations $Y \in T_{X(\cdot,t)}M$
satisfying (15), i.e.,



$dA(X, t_1, t_2, Y) = 0$, for all $Y(\cdot,t) \in T_{X(\cdot,t)}M$ satisfying (15). (16)

We find that (16) translates into the following problem: 

Find X{.,t) e M, G L2 ([0,T], (Tx(„t)M)*), such that 

Q 

T 

+ j j p(X{x,t),t)g-Y{yi,t)dxdt 
° n 

T 

+ a [ [ H{X{s,t))Y{s,t)-n{X{s,t))dSdt^O, 

Jo JdQa{0) 

for all Y, G ([0, T], Tx(„t)M) , Y(.,0) = Y(.,T) = 0. 

(17) 

Here, $H$ denotes the mean curvature. The space $(T_{X(\cdot,t)}M)^*$ denotes
the dual to $T_{X(\cdot,t)}M$. The duality is given by the Riemannian metric
induced by the kinetic energy. Consequently, we can identify the dual space
with itself.

3 Evolution of free surfaces and Monge- Ampere 
Transport Problem 

As the first step towards the sought-after variational principle, we discretize
in time the Lagrangian (9) by the implicit Euler scheme to proceed from a given
state $X^k$ to $X^{k+1}$. The dissipative effects are included in the
definition of the generalized action, but they are not included in the
Lagrangian itself. To overcome this difficulty, we rescale the time coordinate.
Namely, let X be given by






Then, using the substitution ^ ^ r, 

rt2 



where 



def e 



Mt-to) _ ^ 



A 



ie^(t to) y po(x) ^^x(x,t)^ dxdt 



dx (Ar + 1)^ dr. 



(18) 



(19) 



Similarly, 



= j E (X(x, r) j dr. 



( 20 ) 



Let $\tau_k = k\Delta\tau$, and let $X^k(\cdot) = X(\cdot, k\Delta\tau)$. In
accordance with the PSA, we assume that $X^{k+1}$, $X^{k-1}$ are given states.
We are looking for $X^k$ which is a solution of the minimization problem



inf < (Ar/c 4- 1)^ | / Po (x) F ^X^ dx + (A r)^ E (x^ | X G M } , where, 

I n 

F (x) X(x) - X''+i(x) ^ + X(x) - X*-i(x) ^ . 



( 21 ) 



Since 



A r : 



g-^(^fc + l ^o) ^ ^o) 



oM^k—to) . 



pAAtfc 



A 



A 



we have for A 1 

At ^ Atk. (22) 

Hence, going back to the original time coordinate and using the above approx- 
imation, we obtain 



X'= = Arginf i / po(x)F (X) dx + {Atkf E (X) . 
xgm J 
n 



( 23 ) 
Remark 3.1 (Weak form (17) and the variational principle (23)). The
variational formulation (23) is obtained from the PSA applied to the time
interval $(t_{k-1}, t_{k+1})$ by using the trapezoidal rule to provide an
integration rule in time. Computing the Gateaux derivative of the functional
appearing in (23) and using the discrete integration by parts in time, we
recover the weak formulation (17) discretized implicitly in time. □

Remark 3.2 (Time step $\Delta t_k$). It follows from the above calculations
that if $\lambda \ll 1$ then, from (18) and (22),

$\Delta t_k \approx \frac{\Delta\tau}{\lambda k\,\Delta\tau + 1}$.

Thus, $\lim_{k\to+\infty} \Delta t_k = 0$. □

We first state the following result before we proceed to make the link be- 
tween the minimizer of the variational problem (24) below and the flow maps. 

Theorem 3.3. Let us assume that Q is bounded with piece-wise smooth bound- 
ary, and that G X are given. Then there exists a unique minimizer 

of the following variational problem: 

inf {Vwix, X*'-') + Ee{x) I X e 3 c} , (24) 

We use the following Definitions 



+ dpcdistw- (x,X^“^)^ + 5PLdistvi^ (l - X, 1 
Ee{x)=^J Po(x)g-xdrr + Q:||Vx|| (f?), 

f2 



p(x) PGX(x) + Pl(1 - x(x)), 



X {x e BV{Q, {0, 1}) 3X e K (x°, x) : 



/ x(x) dx= X^(x) dx, V open u C 17} . 

Juj Jy.-\uj) 



Proof. The proof follows from the strict convexity of the Wasserstein distance 
on the relaxed space CICreiaxed which is defined as % but with RF (17, [0, 1]) as the 
base space. The existence of the unique minimizer of (24) is then obtained by 
the density argument, which follows from the compatibility of the Wasserstein 
metric with a convergence in the L^— spaces. The proof can be found in [7]. □ 



Remark 3.4. We recall that the Monge-Ampère problem is formulated as
follows

(25)



where 

{V0 G ^ (x° 5 X^) I 4^ Lip. cont. and convex, Vcj) invertible for a. a. x G f?}, 

£'(x°,x') = |Mea(x°,x') 

/ x^{^)dx = / Borel subsets lu of Q 

Jm-^{u>) Jlo 

3 (x°,x^) ='^ 

{M : {x° > 0} C i? {x^ > 0} C i? I M measurable}. 

(26) 

□ 

The existence result of Theorem 3.3 serves as a basis for showing the solv- 
ability of the variational problem (23). In particular, it follows from [7] and 
[12] that the minimizer has the form 

(27) 

where is the unique solution of the Monge- Ampere problem (25). More- 

over, the link mentioned above is provided by 

x" (X'=(x)) = X°(x), X e (28) 



4 Dynamical System 

In order to obtain a dynamical system for the evolution of the Indicatrix, we
perform two steps. First, we compute the Gateaux derivative of (24) and then
we, formally, take a limit as $\Delta t \to 0^+$. Obtaining the Gateaux
derivative is a delicate task, see [7] for complete details.

Let X^’^+^ G M be the optimal flow map of x^ lo Similarly, let 

Xr G M be a family of optimal maps bringing x^ onto Xr such that Xr I = 

I r=0 

X^’^+1. We expand X^- around the point r = 0. Up to the first order in r, we 
have 



(x) = X'=’''+I(x) + (X''’*+I(x)) 



(29) 



Since Xr G M we have 




(30) 



576 P. Kloucek et al. 

d /xfc.fc+i(x))) I € Tx.,.+iM. 
dr ^ ^ lr=o 

Consequently, in view of (13), 

div -^Xr(X^’^“^^(x))| = 0 for a. a. x G i?. (31) 

dr I T=o 

Hence, we consider all possible variations of the form 
^Xr (X'=’'=+i(x))) = (V/ o h) (x), where 

h G M/e, / G (i?,M^) , and div(V/(h(x))) = 0 for a.a. x G 12. 

(32) 

It follows from [7] that for any / G (j?,R^) there exists h G M/e such 
that the divergence free requirement in (32) is satisfied. Thus, we require 
to solve 

forall/ehFi’^(r2,M^). 

The fundamental step is to show that within 0 (At) we have 

i dist^(x. x''-)^ 1^^, + ^ i dist..(x. X- Y 

« I (x"+'(x) - 2x'=(x) + x'=-'(x)) /(x) dx 

n (34) 

- / (x - /i(x)) • V/(h(x))x*^(x) dx- j (/(h(x)) - /(x)) x'"(x) dx 

Q Q 

for all / e (f2, R^) , he M*. 

The last two integrals vanish due to the transport equation. After some
calculations, suitable approximations, and the limit procedures
$\varepsilon \to 0^+$, $\Delta t \to 0^+$ (again see [7] for details), we find
that the dynamical system is given by a degenerate equation

$\rho_G\,(\partial_{tt}\chi(x,t) - \lambda\,\partial_t\chi(x,t)) = \operatorname{div}\big(\chi(x,t)\,((\rho_G - \rho_L)\,g + \alpha\nabla H(x,t))\big)$, $x \in S(\nabla\chi)$, $t > 0$. (35)



Remark 4.1. The sequence of solutions obtained by solving the discretized
version of (35) is a sequence of approximate minimizers of (24). Hence, this
sequence is written with respect to a moving frame given, implicitly, by the
optimal flow maps. The connection between the successive solutions is given
by

$\chi^{k+1}(\nabla\phi^{k,k+1}(x)) = \chi^k(x)$, $x \in \Omega$, (36)

where $\nabla\phi^{k,k+1}$ is the solution of the Monge-Ampère transport
problem. This relation has three implications. First, it shows that
$\chi^{k+1}$ is the Eulerian representer of the previous time level solution,
$\chi^k$, written with respect to the Lagrangian frame of reference. Secondly,
it follows from (35) that we do not need to convert the two systems. Instead,
we can use $\chi^{k+1}$ directly in the fixed, Eulerian frame of reference.

The last implication of (36) concerns the use of the solutions of (35)
with other differential equations, say incompressible Euler equations,
semidiscretized in time. The purpose of computing $\chi$ is both the tracking
of the phase boundaries and the computation of the evolving density. We have
in the Lagrangian frame of reference

$\rho_{Lagrange}(x, k\Delta t) = \rho_G\,\chi^k(x) + \rho_L\,(1 - \chi^k(x))$. (37)

Since $\rho_{Lagrange}(x, k\Delta t) = \rho_{Euler}(\nabla\phi^{k,k+1}(x), k\Delta t)$,
it follows from (36) that the appropriate form of the density passed to the
Euler equations written in the fixed frame of reference, in the
time-semidiscrete form, is the following

$\rho_{Euler}(x, k\Delta t) = \rho_G\,\chi^{k+1}(x) + \rho_L\,(1 - \chi^{k+1}(x))$. (38)

In other words, the tracking has to be computed one $\Delta t$ ahead of the
solution of the Euler equations. □



5 Appendix: Wasserstein Distance 



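The Wasserstein distance, denoted $\mathrm{dist}_W(\cdot,\cdot)$, is defined
by, [13], [2],

$\mathrm{dist}_W(s_0,s_1)^2 \overset{\mathrm{def}}{=} \inf_{\mu} \iint_{\Omega\times\Omega} |x-y|^2\,d\mu(x,y)$. (39)

In words, the infimum in (39) is taken over bistochastic measures $\mu$ with
marginals $s_0$ and $s_1$, respectively. This means

$\iint_{\Omega\times\Omega} \varphi(x)\,d\mu(x,y) = \int_\Omega \varphi(x)\,s_0(x)\,dx$, for all $\varphi \in C(\Omega)$,

$\iint_{\Omega\times\Omega} \varphi(y)\,d\mu(x,y) = \int_\Omega \varphi(y)\,s_1(y)\,dy$, for all $\varphi \in C(\Omega)$.

(40)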
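
In one space dimension the infimum in (39) can be evaluated in closed form for point clouds: for two sets of n points carrying equal mass 1/n, the optimal coupling pairs the points in sorted order. The sketch below uses this well-known special case only as an illustration of the definition; the discrete setting, the normalization by n and the function name wasserstein2Squared are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Squared Wasserstein distance between two point clouds on the real line.
// Both clouds have the same number of points, each of mass 1/n (assumed
// discrete setting). In 1D the optimal transport is the monotone
// rearrangement, so the points are matched in sorted order.
double wasserstein2Squared(std::vector<double> a, std::vector<double> b)
{
    assert(!a.empty() && a.size() == b.size());
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum / static_cast<double>(a.size());
}
```

Translating a cloud by a distance h gives the value h*h, in line with the quadratic transport cost in (39).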



We assume that the functions $s_0$ and $s_1$ are bounded, non-negative,
measurable functions with compact support. It follows from [Section 3, [2]]
that the infimum in (39) is attainable and, cf. [Proposition 3.1, [2]], that
the unique optimal measure $\mu^*$ has the form

$\mu^*(x,y) = \delta(M^*(y) - x)\,s_0(x)\,dx$. (41)

It can be shown, [Proposition 3.1, [2]], that the representing map $M^*$ is
almost everywhere equal to the gradient of a convex Lipschitz continuous scalar
function which has an almost everywhere invertible gradient. Substituting the
measure $\mu^*$ back into (39), we obtain an alternative formulation of the
Wasserstein distance more closely related to the Monge-Kantorovich problem
([10], [4], [1], [5], [3], [6], [14]). Namely, if the marginals have the same
mass, i.e., $\int_\Omega s_0(x)\,dx = \int_\Omega s_1(x)\,dx$, it follows from
the above discussion that there exists $\phi^*$, convex and Lipschitz
continuous, such that $\nabla\phi^* \in \mathcal{L}(s_0,s_1)$ and invertible,
solving



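$\inf \left\{ \int_\Omega s_0(x)\,|M(x) - x|^2\,dx \;\middle|\; M \in \mathcal{L}(s_0,s_1),\ M \text{ invertible} \right\}$, where

$\mathcal{J}(s_0,s_1) \overset{\mathrm{def}}{=} \{M : \{s_0 > 0\} \subset \Omega \to \{s_1 > 0\} \subset \Omega \mid M \text{ measurable}\}$,

$\mathcal{L}(s_0,s_1) \overset{\mathrm{def}}{=} \{M \in \mathcal{J}(s_0,s_1) \mid \int_{M^{-1}(\omega)} s_0(x)\,dx = \int_\omega s_1(x)\,dx$, for all Borel subsets $\omega$ of $\Omega\}$. (42)

Then

$\mathrm{dist}_W(s_0,s_1)^2 = \int_\Omega s_0(x)\,|\nabla\phi^*(x) - x|^2\,dx$. (43)

We note that the requirement of the transport of $s_0$ onto $s_1$ by a volume
preserving map can be written in the form

$\int_\Omega s_1(x)\,\varphi(x)\,dx = \int_\Omega s_0(x)\,\varphi(M(x))\,dx$, for all $\varphi \in C(\Omega)$, (44)

which is to say that, for $M \in M$,

$s_1(M(x)) = s_0(x)$, $x \in \Omega$.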
Summary. Nonobtuse tetrahedral partitions and linear finite elements guarantee
the validity of a discrete analogue of the maximum principle for a wide class of 
parabolic and elliptic problems in the three-dimensional space. In this paper we 
propose global and local refinement techniques which produce nonobtuse face-to- 
face tetrahedral partitions of a polyhedral domain. 



1 Introduction 

The maximum principle represents one of the most characteristic features of 
solutions of second order elliptic or parabolic problems. In this paper we sur- 
vey some of our recent results concerning the discrete maximum principle for 
three-dimensional nonlinear elliptic boundary-value problems solved by lin- 
ear tetrahedral finite elements. In particular, we introduce sufficient geometric 
conditions on tetrahedral meshes that guarantee the validity of the discrete 
maximum principle, and present algorithms for global and local refinements of 
meshes preserving these geometrical properties. 

Linear tetrahedral finite elements are commonly used for solving second 
order boundary value problems, since they do not require a high regularity of 
the solution. The structure and properties of the associated stiffness matrix 
essentially depend on the dihedral angles between faces of tetrahedral elements. 
To see this fact, let us consider an arbitrary tetrahedron ABCD. Let p and q 
be two linear functions such that 

$p(A) = 1$, $p(B) = p(C) = p(D) = 0$,
$q(B) = 1$, $q(A) = q(C) = q(D) = 0$.

Then a straightforward calculation leads to the following formula (see [11,
p. 63] for the proof)

$\nabla p \cdot \nabla q = -\frac{\mathrm{meas}_2 ACD \ \mathrm{meas}_2 BCD}{9\,(\mathrm{meas}_3 ABCD)^2}\,\cos\alpha$, (1.1)
where $\alpha$ is the angle between the faces ACD and BCD (see Fig. 1) and the
symbol $\mathrm{meas}_d$ stands for $d$-dimensional measure. Scalar product
(1.1) is, therefore, independent of all the other 5 dihedral angles of the
tetrahedron ABCD. If $\alpha > \pi/2$ then the scalar product (1.1) is
obviously positive. Hence, each obtuse dihedral angle of the tetrahedron ABCD
gives a positive contribution to the corresponding off-diagonal entry of the
element stiffness matrix, when solving a boundary value problem with the
Laplace operator by the finite element method.
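
Formula (1.1) is easy to check numerically. The sketch below computes the left-hand side from the gradients of the two linear functions and the right-hand side from the face areas, the volume and the dihedral angle at the edge CD; the helper names gradDot and formulaRhs are introduced here for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0]-b[0], a[1]-b[1], a[2]-b[2]}; }
double dot(const Vec3& a, const Vec3& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
Vec3 cross(const Vec3& a, const Vec3& b)
{
    return {a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]};
}
double norm(const Vec3& a) { return std::sqrt(dot(a, a)); }

// Left-hand side of (1.1): grad p . grad q, where p is linear with p(A) = 1
// and p = 0 on the face BCD, and q is linear with q(B) = 1, q = 0 on ACD.
double gradDot(const Vec3& A, const Vec3& B, const Vec3& C, const Vec3& D)
{
    const Vec3 nP = cross(sub(C, B), sub(D, B));  // normal of the face BCD
    const Vec3 nQ = cross(sub(C, A), sub(D, A));  // normal of the face ACD
    const double sP = dot(sub(A, B), nP);         // grad p = nP / sP
    const double sQ = dot(sub(B, A), nQ);         // grad q = nQ / sQ
    assert(sP != 0.0 && sQ != 0.0);               // non-degenerate tetrahedron
    return dot(nP, nQ) / (sP * sQ);
}

// Right-hand side of (1.1), with alpha the dihedral angle at the edge CD.
double formulaRhs(const Vec3& A, const Vec3& B, const Vec3& C, const Vec3& D)
{
    const Vec3 e = sub(D, C);
    const Vec3 a = sub(A, C), b = sub(B, C);
    const double ta = dot(a, e) / dot(e, e), tb = dot(b, e) / dot(e, e);
    // Parts of CA and CB perpendicular to the common edge CD.
    const Vec3 ap = {a[0]-ta*e[0], a[1]-ta*e[1], a[2]-ta*e[2]};
    const Vec3 bp = {b[0]-tb*e[0], b[1]-tb*e[1], b[2]-tb*e[2]};
    const double cosAlpha = dot(ap, bp) / (norm(ap) * norm(bp));
    const double areaACD = 0.5 * norm(cross(sub(C, A), sub(D, A)));
    const double areaBCD = 0.5 * norm(cross(sub(C, B), sub(D, B)));
    const double vol = std::fabs(dot(sub(A, B), cross(sub(C, B), sub(D, B)))) / 6.0;
    return -areaACD * areaBCD * cosAlpha / (9.0 * vol * vol);
}
```

For A = (1,0,0), B = (-1,1,0), C = (0,0,0), D = (0,0,1) the dihedral angle at CD is 135 degrees and both sides equal 1, a positive off-diagonal contribution as stated above.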



Fig. 1. A tetrahedron whose two faces include the angle $\alpha$



Note that the same is also true (see [11]) for a wider class of nonlinear
elliptic problems of the form

$-\nabla\cdot(\lambda(x,u,\nabla u)\,\nabla u) = f(x)$ in $\Omega$, (1.2)

$u = 0$ on $\partial\Omega$, (1.3)

where $\lambda$ is a positive smooth function and $\Omega$ is a bounded
polyhedral domain with Lipschitz boundary $\partial\Omega$. Equation (1.2)
describes, for instance, a stationary nonlinear heat conduction or magnetic
potential in ferromagnetic media (cf. [13]).

According to [5, p. 206], problem (1.2)-(1.3) satisfies the maximum principle,
i.e., if $f \le 0$ then the maximum of $u$ over $\overline{\Omega}$ is
attained on the boundary $\partial\Omega$ (see also [14]). It is thus natural
to look for a class of finite elements such that the same implication is
satisfied. Note that, e.g., for bilinear rectangular elements the discrete
maximum principle can be violated (see [1, p. 254]). The same is true also for
trilinear block elements (a simple example is given in [12, p. 562]). However,
if we decompose these elements into nonobtuse triangles or tetrahedra keeping
the number of degrees of freedom, the discrete maximum principle will hold.
2 Nonobtuse tetrahedralizations

A tetrahedron is said to be nonobtuse if all six dihedral angles between its
faces are less than or equal to $\pi/2$. By a tetrahedralization we shall mean
a face-to-face partition of $\overline{\Omega}$ into tetrahedra in the standard
sense (see [3]). A tetrahedralization is said to be nonobtuse if it contains
only nonobtuse tetrahedra.

By [3, p. 150], linear triangular nonobtuse elements guarantee the validity
of the discrete maximum principle in the plane. This result is generalized to
three-dimensional space in [11], namely, linear tetrahedral elements applied to
problem (1.2)-(1.3) on nonobtuse partitions yield irreducibly diagonally
dominant stiffness matrices (all of whose off-diagonal entries are
nonpositive). It is well known (see [15, p. 85]) that such matrices are
monotone. Nonobtuse tetrahedral partitions thus guarantee the validity of the
discrete maximum principle for problem (1.2)-(1.3) solved by linear elements,
i.e., we have $u_h \le 0$ provided $f \le 0$, where $u_h$ is the continuous
and piecewise linear Galerkin approximation of the solution of (1.2)-(1.3). In
other words, $u_h$ attains its maximum on the boundary $\partial\Omega$, if
the homogeneous Dirichlet boundary conditions are considered. Note that
nonobtuse tetrahedralizations enable us to prove the convergence of finite
element approximations to the weak solution (see, e.g., [4]).

According to [7], an arbitrary polyhedron can be decomposed into nonobtuse
tetrahedra. In order to reduce the discretization error, a given partition is
refined locally or globally. This is why the issue of preserving
nonobtuseness arises during the refinement process.

In [6] we give a global refinement procedure yielding nonobtuse tetrahe- 
dra over the whole domain. This procedure is briefly described in Section 3. 
However, such a technique requires a large amount of computer memory to 
store the associated stiffness matrix. Therefore, in Section 4 we introduce a lo- 
cal refinement procedure yielding nonobtuse partitions that refine only near 
a given vertex, where a singularity of the exact solution may appear (see, e.g.,
[2], [8], [9]).



3 Global refinement techniques 

Definition 3.1. A tetrahedron is said to be a path tetrahedron if it has three 
mutually perpendicular edges which do not pass through the same vertex. 

The reason for the name of the above tetrahedron is that its three perpen- 
dicular edges form a “path” (see Fig. 2). 

Proposition 3.2. Any path tetrahedron is nonobtuse. 

For the proof see [6, p. 728-729]. 
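
Proposition 3.2 can be checked on a concrete example. By (1.1), the dihedral angle between the faces opposite to two vertices is nonobtuse exactly when the gradients of the corresponding linear nodal functions have a nonpositive scalar product, which gives a simple test. The code is a sketch under that observation; the names baryGradient, isNonobtuse and isPath and the tolerance are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Tet = std::array<Vec3, 4>;

Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0]-b[0], a[1]-b[1], a[2]-b[2]}; }
double dot(const Vec3& a, const Vec3& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
Vec3 cross(const Vec3& a, const Vec3& b)
{
    return {a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]};
}

// Gradient of the linear function equal to 1 at vertex i and 0 at the others.
Vec3 baryGradient(const Tet& t, int i)
{
    const Vec3& a = t[(i + 1) % 4];
    const Vec3& b = t[(i + 2) % 4];
    const Vec3& c = t[(i + 3) % 4];
    const Vec3 n = cross(sub(b, a), sub(c, a));  // normal of the opposite face
    const double s = dot(sub(t[i], a), n);       // six times the signed volume
    assert(s != 0.0);                            // non-degenerate tetrahedron
    return {n[0] / s, n[1] / s, n[2] / s};
}

// True if no pair of faces encloses an obtuse dihedral angle, cf. (1.1).
bool isNonobtuse(const Tet& t, double tol = 1e-12)
{
    for (int i = 0; i < 4; ++i)
        for (int j = i + 1; j < 4; ++j)
            if (dot(baryGradient(t, i), baryGradient(t, j)) > tol) return false;
    return true;
}

// True if the edges t0t1, t1t2, t2t3 are mutually perpendicular (a "path").
bool isPath(const Tet& t, double tol = 1e-12)
{
    const Vec3 e1 = sub(t[1], t[0]), e2 = sub(t[2], t[1]), e3 = sub(t[3], t[2]);
    return std::fabs(dot(e1, e2)) <= tol && std::fabs(dot(e2, e3)) <= tol &&
           std::fabs(dot(e1, e3)) <= tol;
}
```

The test isPath assumes the vertices are listed along the path; a path tetrahedron given in another vertex order would have to be reordered first.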

In Fig. 2 we observe a typical shape of a path tetrahedron (all its right 
angles, solid and dihedral, are indicated there). 
Fig. 2. A path tetrahedron



Theorem 3.3. Let T be an arbitrary tetrahedron such that its circum- 
centre belongs to T, and let all faces of T be nonobtuse triangles. Then there 
exists a family of tetrahedral partitions of T containing only path tetrahedra. 

This theorem is proved in [6]. Note that each path tetrahedron satisfies 
the assumptions of Theorem 3.3, since its faces are right triangles and its 
circumcentre is the midpoint of the longest edge. In Fig. 3 we see a partition 
of a tetrahedron T, that satisfies the assumptions of Theorem 3.3, into path 
tetrahedra. They are defined in the following way. First we divide each face 
F of T into 6 or 4 right subtriangles by connecting the circumcentre of F 
with 3 vertices and 3 midpoints of sides of F. The common vertex of these 
subtriangles is the circumcentre of F. This kind of plane refinement we call 2d 
yellow (see [6]). Denoting the circumcentre of T by G, we can define the path 
subtetrahedra as the convex hull of G and particular right subtriangles on the 
surface of T (compare with Fig. 3). We call such a kind of three-dimensional 
refinement 3d yellow. 
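
The 2d yellow refinement of a single face can be sketched as follows. The code computes the circumcentre of a planar triangle and forms the subtriangles obtained by joining it with the vertices and the midpoints of the sides; the planar coordinates and the names circumcentre, yellow2d and area are assumptions made for the illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Pt = std::array<double, 2>;
using Tri = std::array<Pt, 3>;

// Circumcentre of a non-degenerate planar triangle.
Pt circumcentre(const Tri& t)
{
    const Pt& a = t[0];
    const Pt& b = t[1];
    const Pt& c = t[2];
    const double d = 2.0 * (a[0]*(b[1]-c[1]) + b[0]*(c[1]-a[1]) + c[0]*(a[1]-b[1]));
    assert(d != 0.0);
    const double a2 = a[0]*a[0] + a[1]*a[1];
    const double b2 = b[0]*b[0] + b[1]*b[1];
    const double c2 = c[0]*c[0] + c[1]*c[1];
    return {(a2*(b[1]-c[1]) + b2*(c[1]-a[1]) + c2*(a[1]-b[1])) / d,
            (a2*(c[0]-b[0]) + b2*(a[0]-c[0]) + c2*(b[0]-a[0])) / d};
}

double area(const Tri& t)
{
    return 0.5 * std::fabs((t[1][0]-t[0][0]) * (t[2][1]-t[0][1]) -
                           (t[2][0]-t[0][0]) * (t[1][1]-t[0][1]));
}

// Subtriangles (circumcentre, vertex, midpoint of a side). The right angle is
// at the midpoint because the circumcentre lies on each perpendicular bisector.
std::vector<Tri> yellow2d(const Tri& t)
{
    const Pt o = circumcentre(t);
    std::vector<Tri> out;
    for (int i = 0; i < 3; ++i) {
        const Pt& p = t[i];
        const Pt& q = t[(i + 1) % 3];
        const Pt m = {0.5 * (p[0] + q[0]), 0.5 * (p[1] + q[1])};
        const Tri first = {{o, p, m}};
        const Tri second = {{o, q, m}};
        out.push_back(first);
        out.push_back(second);
    }
    return out;
}
```

For a right triangle the circumcentre coincides with the midpoint of the hypotenuse, so two of the six subtriangles returned here degenerate, which corresponds to the case of 4 subtriangles mentioned above.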



The advantage of the above approach is that a common face F of any two 
adjacent tetrahedra (satisfying the assumptions of Theorem 3.3) in a given 
tetrahedralization of a polyhedral domain is divided in a unique way. This 
enables us to develop global refinement techniques yielding only nonobtuse 
subtetrahedra (see [6] for details). 



4 Local refinement techniques 

The main idea of a local refinement technique producing only nonobtuse
partitions is presented in the following theorem.

Fig. 3. Nonobtuse partition of a tetrahedron into path subtetrahedra



Theorem 4.1. Let ABCD be a path tetrahedron whose edges AB, BC, 
and CD are mutually perpendicular. Then there exists an infinite family of 
nonobtuse tetrahedralizations of ABCD into path tetrahedra that locally re- 
fine ABCD in a neighbourhood of the vertex A. 

For a detailed constructive proof see [2] or [9]. Its main idea is sketched in
Fig. 4. Using several sophisticated orthogonal projections, we first subdivide
the tetrahedron ABCD into five nonobtuse tetrahedra. Then we show that
the path tetrahedron ATSQ from Fig. 4 is similar to the original tetrahedron
ABCD. The subtetrahedron ATSQ can now be decomposed into 5 subtetrahedra
in a similar way as ABCD, and thus we can get further refinement near
the point A. In this manner, we obtain recursively the required infinite family
of nonobtuse tetrahedralizations.

An algorithm for a local refinement of a cube producing only nonobtuse
tetrahedra is presented in [8]. If several cubes meet at one point, then we can
apply this algorithm to each of them so that the whole partition remains
face-to-face. For instance, in Fig. 5 we see such a local refinement of a
polyhedral domain $\Omega = (-1,1)^3 \setminus [0,1)^3$ which represents the
union of 7 cubes. Each cube is first divided in a standard way into 6 path
tetrahedra having a common vertex in the reentrant corner of $\Omega$. For
each of the 7 x 6 = 42 tetrahedra we apply the algorithm given by Theorem 4.1
such that the partition of each path tetrahedron is just the mirror image of
the partition of any adjacent tetrahedron having a common face. Note that the
concave (reentrant) corner in Fig. 5 is called the Fichera corner or the
Fichera vertex.
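
One standard way to divide a cube into 6 path tetrahedra with a common vertex is to take one tetrahedron per ordering of the coordinate axes. The sketch below does this for the unit cube with the common vertex at the origin; whether this is exactly the division used for Fig. 5 is an assumption, as are the names kuhnPartition, volume and isPath.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Vec3 = std::array<double, 3>;
using Tet = std::array<Vec3, 4>;

Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0]-b[0], a[1]-b[1], a[2]-b[2]}; }
double dot(const Vec3& a, const Vec3& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
Vec3 cross(const Vec3& a, const Vec3& b)
{
    return {a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]};
}

// Six tetrahedra of the unit cube, one per ordering of the coordinate axes.
// Each starts at the origin and walks along the three axes in the given order,
// so its edges t0t1, t1t2, t2t3 form a path of mutually perpendicular edges.
std::vector<Tet> kuhnPartition()
{
    std::array<int, 3> perm = {0, 1, 2};
    std::vector<Tet> tets;
    do {
        Tet t;
        Vec3 v = {0.0, 0.0, 0.0};
        t[0] = v;
        for (int k = 0; k < 3; ++k) {
            v[perm[k]] += 1.0;
            t[k + 1] = v;
        }
        tets.push_back(t);
    } while (std::next_permutation(perm.begin(), perm.end()));
    return tets;
}

double volume(const Tet& t)
{
    return std::fabs(dot(sub(t[1], t[0]),
                         cross(sub(t[2], t[0]), sub(t[3], t[0])))) / 6.0;
}

bool isPath(const Tet& t)
{
    const Vec3 e1 = sub(t[1], t[0]), e2 = sub(t[2], t[1]), e3 = sub(t[3], t[2]);
    return dot(e1, e2) == 0.0 && dot(e2, e3) == 0.0 && dot(e1, e3) == 0.0;
}
```

Applying the same construction to each of the 7 cubes meeting at the reentrant corner gives the 42 path tetrahedra counted above.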
Fig. 4. Partition of a path tetrahedron into 5 path subtetrahedra

Fig. 5. Local refinement technique producing nonobtuse tetrahedra near the
Fichera corner



5 Concluding remarks 

Nonobtuseness of all dihedral angles of all tetrahedra in the partition
represents only a sufficient condition to guarantee the discrete maximum
principle for linear elements. In [10] we present a weakened condition for the
shape of tetrahedra, which enables us to use tetrahedra, some of whose dihedral
angles are slightly bigger than $\pi/2$. This condition is also only
sufficient to get a monotone stiffness matrix and thus also the validity of
the discrete maximum principle.
Summary. The paper deals with a posteriori error estimation in terms of special
problem-oriented quantities, represented as linear functionals that control the
behavior of a solution in certain subdomains, along some lines, or at especially
interesting points. The method of estimating such quantities is based on the
analysis of the adjoint boundary-value problems, whose right-hand sides are formed
by the considered linear functionals. In this way, we propose a new effective
approach based on two principles: (a) the original and adjoint problems are solved
on non-coinciding meshes, and (b) the term presenting the product of gradients of
errors of the primal and adjoint problems is estimated by using the “gradient
averaging” technique. The model problem of elliptic type is analysed and the results
of numerical tests are presented.



1 Introduction 

A posteriori error estimates play an important role in modern numerical analysis. 
For finite element methods, they are usually obtained either by estimating 
a weak norm of the residual (see, e.g., [1], [2], [3], [4], [5], [12]) or by using 
special post-processing procedures (see, e.g., [12], [13]). For Galerkin approximations 
of linear elliptic problems, they estimate the error in the global (energy) norm 
and also provide error indicators that are further used in various 
mesh adaptive procedures. Global error estimates give a general picture 
of the quality of an approximate solution and a stopping criterion. However, 
for engineering purposes, such information is often not sufficient. In 
many cases, analysts are mainly interested not in the value of the total error, 
but in errors over certain subdomains, lines, or at special points. A possible 
way of estimating such errors is to introduce a linear functional $\ell$ associated 
with a “quantity of interest” and to obtain an estimate for the value of 
$\langle \ell, u - \bar u \rangle$, where $u$ is the exact solution and $\bar u$ is the approximate one. 

Known methods (see, e.g., [1], [6], [11]) find estimates of $\langle \ell, u - u_h \rangle$ for 
a Galerkin approximation $u_h$ by employing an additional (adjoint) problem, 
whose right-hand side is formed by the functional $\ell$. If the Galerkin approximation 
of the adjoint problem is computed on the same mesh as $u_h$, then 
the functional $\langle \ell, u - u_h \rangle$ is expressed via a certain integral functional, 
which can be estimated by using, e.g., the “equilibrated residual method” (see, 
e.g., [1], [11]). 

In the present work, we propose a new way of estimation of “quantities of 
interest”. It is based on two principles: (a) the original and adjoint problems 
are solved on non-coinciding meshes, and (b) the term presenting the product 
of errors arising in the primal and adjoint problems is estimated by the gradient 
recovery technique widely used in various applied problems (see [7], [10], [12], 
[13]). This distinguishes our approach from others, where it is usually assumed that 
the Galerkin approximations of the primal and adjoint problems are computed 
in the same finite dimensional subspaces. 

The effectiveness of the method suggested in this paper increases strongly 
when one is interested not in a single solution of the primal problem for 
concrete data, but analyzes a series of approximate solutions for a certain set of 
boundary conditions and various right-hand sides (such a situation is typical 
in the engineering design when it is necessary to model the behavior of a con- 
struction for various working regimes). In this case, the adjoint problem must 
be solved only once for each “quantity of interest”, and its solution can be 
further used in testing the accuracy of approximate solutions of various primal 
problems. 



2 General scheme 

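Introduce Hilbert spaces $H$ and $Y$ with scalar products and norms 

$$(u,v)_H, \quad \|u\|_H = (u,u)_H^{1/2}, \qquad (y,z)_Y, \quad \|y\|_Y = (y,y)_Y^{1/2},$$

and the Banach space $V$, which is continuously embedded into $H$, with the 
norm denoted by $\|\cdot\|_V$. Let $\Lambda \in \mathcal{L}(V,Y)$, $\mathcal{A} \in \mathcal{L}(Y,Y)$, and 

$$c_1 \|y\|_Y^2 \le (\mathcal{A}y, y)_Y \le c_2 \|y\|_Y^2 \quad \forall y \in Y, \qquad c_3 \|w\|_V \le \|\Lambda w\|_Y \quad \forall w \in V_0, \qquad (2.1)$$

where $V_0$ is a subspace of $V$ and $c_1$, $c_2$, $c_3$ are positive constants. Given $f \in V_0^*$, 
consider the following problem: 

Primal Problem $(\mathcal{P})$: Find $u \in V_0$ such that 

$$(\mathcal{A}\Lambda u, \Lambda w)_Y = \langle f, w \rangle \quad \forall w \in V_0, \qquad (2.2)$$

where $\langle \cdot, \cdot \rangle$ denotes the duality pairing of the spaces $V_0$ and $V_0^*$. 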

Let $\ell$ be another element of $V_0^*$, which forms the “quantity of interest” 
$\langle \ell, u - \bar u \rangle$ for an arbitrary element $\bar u \in V_0$ viewed as an approximation 
of $u$. In order to estimate this quantity, we introduce another 
problem. 

Adjoint Problem $(\mathcal{P}_a)$: Find $v \in V_0$ such that 

$$(\mathcal{A}^*\Lambda v, \Lambda w)_Y = \langle \ell, w \rangle \quad \forall w \in V_0, \qquad (2.3)$$

where $\mathcal{A}^*$ is the operator adjoint to $\mathcal{A}$, i.e., $(\mathcal{A}y, z)_Y = (y, \mathcal{A}^* z)_Y$ $\forall y, z \in Y$. 

In the frequently encountered case of selfadjoint operators, the left-hand 
sides of (2.2) and (2.3) coincide and both problems are associated 
with a functional of the type 

$$J(w) = \tfrac{1}{2}(\mathcal{A}\Lambda w, \Lambda w)_Y - \langle \mu, w \rangle,$$

which is known to have a unique minimizer on $V_0$ for any $\mu \in V_0^*$. 

Proposition 1. Let $u$ and $v$ be solutions of problems $(\mathcal{P})$ and $(\mathcal{P}_a)$, respectively. 
Then for any $\bar u, \bar v \in V_0$ 

$$\langle \ell, u - \bar u \rangle = E(\bar u, \bar v) = E_0(\bar u, \bar v) + E_1(\bar u, \bar v), \qquad (2.4)$$

where 

$$E_0(\bar u, \bar v) = \langle f, \bar v \rangle - (\mathcal{A}\Lambda \bar u, \Lambda \bar v)_Y, \qquad (2.5)$$

and 

$$E_1(\bar u, \bar v) = (\mathcal{A}\Lambda(u - \bar u), \Lambda(v - \bar v))_Y. \qquad (2.6)$$

The proof can be found in [8]. 

The term $E_0(\bar u, \bar v)$ is explicitly computable, whereas the term $E_1(\bar u, \bar v)$ 
contains the unknown solutions of Problems $(\mathcal{P})$ and $(\mathcal{P}_a)$. Evidently, the term 
$E_0(\bar u, \bar v)$ dominates if $\bar v$ is “sufficiently close” to the exact solution $v$ of 
Problem $(\mathcal{P}_a)$. Indeed, if $\bar v \to v$ in $V$, then $\bar v \to v$ in $H$ and $\Lambda(v - \bar v) \to 0$ in $Y$, so 
that 

$$E_1(\bar u, \bar v) \to 0 \quad \text{and} \quad E_0(\bar u, \bar v) \to \langle \ell, u - \bar u \rangle,$$

i.e., $E_0(\bar u, \bar v)$ contains the major part of the quantity of interest. 

Let $V_h$ and $V_\tau$ be two finite-dimensional subspaces of $V_0$, and let $\bar u = u_h$, 
$\bar v = v_\tau$, where $u_h$ and $v_\tau$ are solutions of the problems 

$$(\mathcal{A}\Lambda u_h, \Lambda w_h)_Y = \langle f, w_h \rangle \quad \forall w_h \in V_h, \qquad (2.7)$$

$$(\mathcal{A}^*\Lambda v_\tau, \Lambda w_\tau)_Y = \langle \ell, w_\tau \rangle \quad \forall w_\tau \in V_\tau. \qquad (2.8)$$

In the particular case $V_h = V_\tau$, the term $E_0(u_h, v_\tau) = 0$ due to the orthogonality 
condition in problem (2.7). Thus, if the meshes in problems (2.7) and (2.8) 
coincide, then the estimate has only one term containing the product of the 
(unknown) energy errors. On the contrary, the use of non-coinciding meshes 
leads to another estimate that has two terms. 

Finding a sharp approximation of the adjoint problem may require high 
computational costs. A more economical way is to use an approximate solution 
of the adjoint problem having approximately the same quality as the 
approximate solution of the primal one and to recover the unknown functions $\Lambda u$ 
and $\Lambda v$ by some post-processing technique. Let $G_h$ and $G_\tau$ be certain averaging 
operators defined on $V_h$ and $V_\tau$, respectively. We replace $E(u_h, v_\tau)$ by 
the directly computable functional 

$$\tilde E(u_h, v_\tau) = E_0(u_h, v_\tau) + \tilde E_1(u_h, v_\tau), \qquad (2.9)$$

where 

$$\tilde E_1(u_h, v_\tau) = (\mathcal{A}(\Lambda u_h - G_h(\Lambda u_h)), \Lambda v_\tau - G_\tau(\Lambda v_\tau))_Y. \qquad (2.10)$$

If the operators $G_h$ and $G_\tau$ provide a proper recovery of $\Lambda u$ and $\Lambda v$, then 
it is natural to expect that the difference between $E_1(u_h, v_\tau)$ and $\tilde E_1(u_h, v_\tau)$ 
consists of higher order terms and, thus, the latter quantity can be 
successfully used instead of $E_1$. This observation is justified theoretically and 
numerically in what follows for a model problem of elliptic type. 
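
The error representation of Proposition 1 can be checked numerically in a purely finite-dimensional setting. The following sketch is only an illustration, not the authors' code: it assumes $V_0 = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ with Euclidean scalar products, represents $\Lambda$ and $\mathcal{A}$ by randomly generated matrices (hypothetical data), and takes arbitrary perturbed vectors as the approximations of $u$ and $v$.

```python
import numpy as np

def goal_error_identity(seed=0, n=5, m=8):
    """Return <l, u - u_bar>, E0 and E1 for random finite-dimensional data."""
    rng = np.random.default_rng(seed)
    Lam = rng.standard_normal((m, n))            # Lambda: R^n -> R^m
    B = rng.standard_normal((m, m))
    S = rng.standard_normal((m, m))
    # Operator A with positive definite symmetric part (not selfadjoint)
    A = B.T @ B + np.eye(m) + 0.5 * (S - S.T)
    f = rng.standard_normal(n)                   # primal right-hand side
    ell = rng.standard_normal(n)                 # functional of interest

    u = np.linalg.solve(Lam.T @ A @ Lam, f)      # primal problem (2.2)
    v = np.linalg.solve(Lam.T @ A.T @ Lam, ell)  # adjoint problem (2.3)

    # Arbitrary approximations of u and v
    u_bar = u + 0.1 * rng.standard_normal(n)
    v_bar = v + 0.1 * rng.standard_normal(n)

    lhs = ell @ (u - u_bar)
    E0 = f @ v_bar - (A @ Lam @ u_bar) @ (Lam @ v_bar)       # computable part
    E1 = (A @ Lam @ (u - u_bar)) @ (Lam @ (v - v_bar))       # product of errors
    return lhs, E0, E1

lhs, E0, E1 = goal_error_identity()
print(lhs, E0 + E1)  # the two numbers should agree up to round-off
```

The term `E0` uses only the data and the approximations, while `E1` needs the exact solutions; this is the splitting that the rest of the paper exploits.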



3 Model elliptic type problem 

3.1 Formulation of the problem 

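We define the operator $\Lambda$ as the gradient $\nabla$ and set 

$$Y = H = L_2(\Omega), \quad V = W_2^1(\Omega), \quad V_0^* = W_2^{-1}(\Omega).$$

Consider the following problem: find $u$ satisfying the system 

$$-\operatorname{div}(A \nabla u) = f \quad \text{in } \Omega, \qquad (3.1)$$

$$u = 0 \quad \text{on } \partial\Omega. \qquad (3.2)$$

In the above, $\Omega$ is a bounded and connected domain in $\mathbb{R}^d$ with a Lipschitz 
continuous boundary $\partial\Omega$, the symbol $\nu$ denotes the unit outward normal to 
the boundary, the matrix of coefficients $A = (a_{ij})$ is symmetric and 
meets the conditions 

$$a_{ij}(x) \in L_\infty(\Omega), \qquad c_1 |\xi|^2 \le A(x)\xi \cdot \xi \le c_2 |\xi|^2 \quad \forall \xi \in \mathbb{R}^d, \ \forall x \in \Omega, \qquad (3.3)$$

where the dot denotes the scalar product in $\mathbb{R}^d$. Also, it is assumed that 

$$f \in L_2(\Omega). \qquad (3.4)$$

Hereafter, various constants, independent of $h$ and $\tau$, are denoted by one letter $C$. 

Now, Problem $(\mathcal{P})$ consists of finding $u \in V_0$ such that 

$$\int_\Omega A \nabla u \cdot \nabla w \, dx = \int_\Omega f w \, dx \quad \forall w \in V_0, \qquad (3.5)$$

where $V_0 = \mathring{W}_2^1(\Omega)$. Let $\bar u \in V_0$ be an approximation of $u$. Assume that we are 
interested in the value of 

$$\langle \ell, u - \bar u \rangle = \int_\Omega \varphi \, (u - \bar u) \, dx, \qquad (3.6)$$

where $\operatorname{supp} \varphi \subset \omega \subset \Omega$. Estimates of such quantities for one (or several) 
a priori given functions $\varphi$ give information about the behaviour of $u - \bar u$ 
in $\omega$. 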

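In this case, the Adjoint Problem $(\mathcal{P}_a)$ consists of finding $v \in V_0$ such that 

$$\int_\Omega A \nabla v \cdot \nabla w \, dx = \langle \ell, w \rangle \quad \forall w \in V_0. \qquad (3.7)$$

Suppose that $\bar v \in V_0$ is an approximation of $v$. Then, for problems (3.5) and 
(3.7), the relation (2.4) reads as follows: 

$$\langle \ell, u - \bar u \rangle = E_0(\bar u, \bar v) + E_1(\bar u, \bar v), \qquad (3.8)$$

where 

$$E_0(\bar u, \bar v) = \int_\Omega f \bar v \, dx - \int_\Omega A \nabla \bar v \cdot \nabla \bar u \, dx, \qquad (3.9)$$

$$E_1(\bar u, \bar v) = \int_\Omega A \nabla(v - \bar v) \cdot \nabla(u - \bar u) \, dx. \qquad (3.10)$$
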

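Let $V_h$ and $V_\tau$ be two finite-dimensional subspaces in $V_0$, constructed by 
the Courant type finite element discretization. The respective Galerkin approximations 
$u_h$ and $v_\tau$ are formed by piecewise-affine continuous functions 
and defined on the domains 

$$\Omega_h = \bigcup_{T_h \in \mathcal{T}_h} T_h, \qquad \Omega_\tau = \bigcup_{T_\tau \in \mathcal{T}_\tau} T_\tau, \qquad (3.11)$$

where the respective triangulations are denoted by $\mathcal{T}_h$ and $\mathcal{T}_\tau$, and their elements 
are denoted by $T_h$ and $T_\tau$, respectively. Consider the corresponding 
finite dimensional problems (we assume for a moment that $\Omega_h = \Omega_\tau = \Omega$): 

Problem $(\mathcal{P}_h)$: Find $u_h \in V_h$ such that 

$$\int_\Omega A \nabla u_h \cdot \nabla w_h \, dx = \int_\Omega f w_h \, dx \quad \forall w_h \in V_h. \qquad (3.12)$$

Problem $(\mathcal{P}_{a\tau})$: Find $v_\tau \in V_\tau$ such that 

$$\int_\Omega A \nabla v_\tau \cdot \nabla w_\tau \, dx = \langle \ell, w_\tau \rangle \quad \forall w_\tau \in V_\tau. \qquad (3.13)$$

Set $\bar u = u_h$ and $\bar v = v_\tau$. Then 

$$\langle \ell, u - u_h \rangle = E(u_h, v_\tau) = E_0(u_h, v_\tau) + E_1(u_h, v_\tau).$$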

On $\mathcal{T}_h$, we define the gradient averaging operator $G_h$, acting on $L_\infty(\Omega_h, \mathbb{R}^d)$, 
as the operator that produces a vector-valued continuous piecewise linear 
function by setting each nodal value to the mean value of $\nabla u_h$ over all elements 
incident with the corresponding nodal point. The averaging operator 
$G_\tau$, acting on $L_\infty(\Omega_\tau, \mathbb{R}^d)$, is defined in the same manner. Nowadays, 
averaging operators of this type are widely used in mathematical modelling 
(see, e.g., [1], [4], [7], [10], [12]). 

Averaging near the boundary (where $\Omega_h \ne \Omega$) requires a more complicated 
analysis. This question has been investigated in the literature (see, e.g., 
[7], where concrete forms of the averaging operators near the boundary are 
presented). In our further considerations, we use the results of the above paper 
and, therefore, impose the same conditions on the problem data and the 
structure of meshes. Namely, we assume that $\partial\Omega$ has the smoothness required in [7], the 
coefficients $a_{ij}$ are smooth, and the sets $\Omega_h$ and $\Omega_\tau$ defined in (3.11) are such 
that 

$$\Omega_h \subset \Omega_\tau \subset \Omega, \qquad \operatorname{dist}\{\partial\Omega_h, \partial\Omega\} \le C h^2, \qquad \operatorname{dist}\{\partial\Omega_\tau, \partial\Omega\} \le C \tau^2. \qquad (3.14)$$

Further, we assume that the functions $u_h$ and $v_\tau$ as well as the averaged gradients 
$G_h(\nabla u_h)$ and $G_\tau(\nabla v_\tau)$ are extended by zero on $\Omega \setminus \Omega_h$ and 
$\Omega \setminus \Omega_\tau$, respectively. 

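For any pair $(w_h, w_\tau) \in V_h \times V_\tau$, we define the following functional 

$$\tilde E(w_h, w_\tau) := E_0(w_h, w_\tau) + \tilde E_1(w_h, w_\tau), \qquad (3.15)$$

where 

$$\tilde E_1(w_h, w_\tau) = \int_{\Omega_h} A \, (\nabla w_h - G_h(\nabla w_h)) \cdot (\nabla w_\tau - G_\tau(\nabla w_\tau)) \, dx. \qquad (3.16)$$

The functional $\tilde E(u_h, v_\tau)$ is directly computable once the approximations 
$u_h$ and $v_\tau$ are defined. Our aim is to show that $\tilde E(u_h, v_\tau)$ is a simple and 
effective estimator of the quantity $\langle \ell, u - u_h \rangle$. 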

We can easily show that 

$$|E_0(u_h, v_\tau)| \le C h \, |u|_{2,2,\Omega}, \qquad (3.17)$$

where $|u|_{2,2,\Omega}$ denotes the $L_2$-norm of the second derivatives of $u$, which exist 
due to the conditions imposed on Problem $(\mathcal{P})$, see below. 

The estimate (3.17) is sharp, in the sense that it gives the correct asymptotic 
order of the term $E_0$, whereas the terms $E_1(u_h, v_\tau)$ and $\tilde E_1(u_h, v_\tau)$ are 
asymptotically smaller. Indeed, 

$$|E_1(u_h, v_\tau)| \le C h \tau \, |u|_{2,2,\Omega} \, |v|_{2,2,\Omega}, \qquad (3.18)$$

and 

$$|\tilde E_1(u_h, v_\tau)| = \Big| \int_{\Omega_h} A \, (\nabla u_h - G_h(\nabla u_h)) \cdot (\nabla v_\tau - G_\tau(\nabla v_\tau)) \, dx \Big|$$
$$\le C \big( \|\nabla u_h - \nabla u\|_{2,\Omega} + \|\nabla u - G_h(\nabla u_h)\|_{2,\Omega_h} \big) \times \big( \|\nabla v_\tau - \nabla v\|_{2,\Omega} + \|\nabla v - G_\tau(\nabla v_\tau)\|_{2,\Omega_\tau} \big).$$

Therefore, if the superconvergence of the averaged gradients takes place, 
then the first terms in the round brackets dominate and, hence, $\tilde E_1(u_h, v_\tau)$ 
has the same asymptotic order $h\tau$. 

From the above, we see that asymptotically $E_0(u_h, v_\tau)$ contains the main 
part of the quantity $\langle \ell, u - u_h \rangle$, except in the case $V_h = V_\tau$, when this term 
vanishes. 

It remains to compare $E_1(u_h, v_\tau)$ and $\tilde E_1(u_h, v_\tau)$. We do this under additional 
assumptions used in [7]. Namely, it is assumed that $f$ and $\ell$ belong to 
$W_2^2(\Omega)$, so that $u, v \in W_2^3(\Omega)$, and that the triangulations $\mathcal{T}_h$ and $\mathcal{T}_\tau$ are composed 
of uniform elements, for which the superconvergence estimates 

II Vu — Gh{Vuk) W 2 , f2h^ ^ ll3,2,/2 + II / ||2,2,r?), (3.19) 

II - GriVVr) ||2,^2.< G V ||3,2,« + || i l|2,2,r2) (3.20) 

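hold. Also, we use the superconvergence of $\Pi_h u$ and $\Pi_\tau v$, where $\Pi_h$ and $\Pi_\tau$ are 
the nodal interpolation operators, which has been established (see [7]) under 
the same additional assumptions. In our case, the estimates have the form 

$$\|\nabla u_h - \nabla(\Pi_h u)\|_{2,\Omega_h} \le C h^{3/2} \big( \|u\|_{3,2,\Omega} + \|f\|_{2,2,\Omega} \big), \qquad (3.21)$$

$$\|\nabla v_\tau - \nabla(\Pi_\tau v)\|_{2,\Omega_\tau} \le C \tau^{3/2} \big( \|v\|_{3,2,\Omega} + \|\ell\|_{2,2,\Omega} \big). \qquad (3.22)$$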

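Proposition 2. Let $\partial\Omega$ have the smoothness assumed above, let $\mathcal{T}_h$ and $\mathcal{T}_\tau$ be two families of triangulations with 
the above described properties, and let $u, v \in W_2^3(\Omega)$. Then, for sufficiently small $h$ 
and $\tau$ ($\tau < h$), we have 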

\Ei(uh-)V^^—Ei{uh’)Vj-')\ < (7 (/iT2 -|-/it(/i-|-t) 2 -l-T^)-|-ju(/i, t), (3.23) 

where $m$ is any positive integer greater than 2, $\mu(h, \tau)$ contains higher order 
terms, and the constant $C$ does not depend on $h$ and $\tau$. 

The proof can be found in [8]. 

Let us summarize the above analysis of the behavior of $E_0(u_h, v_\tau)$, 
$E_1(u_h, v_\tau)$, and $\tilde E_1(u_h, v_\tau)$ with respect to sufficiently small $h$ and $\tau$ ($h > \tau$): 

- the explicitly computable term $E_0(u_h, v_\tau)$ is the major one; 

- the terms $E_1(u_h, v_\tau)$ and $\tilde E_1(u_h, v_\tau)$ are subsidiary ones; 

- the difference between $E_1(u_h, v_\tau)$ and $\tilde E_1(u_h, v_\tau)$ has higher asymptotic 
order provided that the solutions of the problems $(\mathcal{P})$ and $(\mathcal{P}_a)$ are regular 
enough to guarantee the superconvergence of the Galerkin approximations. 

The above observations suggest the following numerical strategy: 

a) Define $V_\tau$ taking into account the nature of the functional $\ell$ (e.g., by 
putting extra trial functions in a subdomain associated with it), and calculate $v_\tau$. 

b) Define $V_h$ and calculate $u_h$. 

c) Calculate $E_0(u_h, v_\tau)$ directly and use post-processed values of $\nabla u_h$ and 
$\nabla v_\tau$ to estimate $E_1(u_h, v_\tau)$, replacing the gradients by the averaged gradients. 
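
Steps a) to c) can be sketched on a one-dimensional analogue of the model problem. The example below is an illustration under stated assumptions, not the authors' implementation: it takes $-u'' = 1$ on $(0,1)$ with homogeneous Dirichlet conditions, the quantity of interest $\int_\omega (u - u_h)\,dx$ with $\omega = (a,b)$, uniform nested meshes (the adjoint mesh size divides the primal one, and $a$, $b$ are nodes of both meshes), and simple one-sided averaging at the boundary nodes. All function names are hypothetical.

```python
import numpy as np

def solve_p1(n, load):
    """P1 Galerkin solution of -w'' = rhs on (0,1), w(0)=w(1)=0, n uniform elements.
    `load` holds the integrals of rhs against the n-1 interior hat functions."""
    K = n * (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1))
    w = np.zeros(n + 1)
    w[1:-1] = np.linalg.solve(K, load)
    return w

def averaged_gradient(w):
    """Nodal values of the averaged gradient: mean slope of the adjacent elements."""
    n = len(w) - 1
    s = np.diff(w) * n
    g = np.empty(n + 1)
    g[1:-1] = 0.5 * (s[:-1] + s[1:])
    g[0], g[-1] = s[0], s[-1]          # only one element touches a boundary node
    return g

def goal_error_estimate(n_h=8, n_tau=32, a=0.25, b=0.5):
    h, tau = 1.0 / n_h, 1.0 / n_tau
    x_h, x_t = np.linspace(0, 1, n_h + 1), np.linspace(0, 1, n_tau + 1)
    mid = 0.5 * (x_t[:-1] + x_t[1:])
    inside = ((mid > a) & (mid < b)).astype(float)   # fine elements lying in omega

    # a) adjoint problem on the fine mesh: <l, w> = integral of w over omega
    ell = np.zeros(n_tau + 1)
    ell[:-1] += 0.5 * tau * inside
    ell[1:] += 0.5 * tau * inside
    v = solve_p1(n_tau, ell[1:-1])

    # b) primal problem on the coarse mesh, f = 1 (each load integral equals h)
    u = solve_p1(n_h, np.full(n_h - 1, h))

    # c) E0 = <f, v_tau> - int u_h' v_tau' dx, evaluated on the fine mesh
    U = np.interp(x_t, x_h, u)                       # u_h at the fine nodes
    su, sv = np.diff(U) / tau, np.diff(v) / tau      # element slopes
    E0 = tau * v.sum() - tau * np.dot(su, sv)

    # averaged-gradient term, 2-point Gauss quadrature on every fine element
    g_h, g_t = averaged_gradient(u), averaged_gradient(v)
    gp = (mid[:, None] + tau / (2 * np.sqrt(3)) * np.array([-1.0, 1.0])).ravel()
    du = np.repeat(su, 2) - np.interp(gp, x_h, g_h)
    dv = np.repeat(sv, 2) - np.interp(gp, x_t, g_t)
    E1_avg = 0.5 * tau * np.sum(du * dv)

    # reference value of <l, u - u_h> for the exact solution u = x(1-x)/2
    int_u = (b**2 / 4 - b**3 / 6) - (a**2 / 4 - a**3 / 6)
    int_uh = np.sum(0.5 * tau * (U[:-1] + U[1:]) * inside)
    return E0, E1_avg, int_u - int_uh

E0, E1_avg, err = goal_error_estimate()
print("E0 =", E0, " averaged E1 =", E1_avg, " true error =", err)
```

For this very regular one-dimensional data the estimator is expected to reproduce the error closely; the example is meant only to show where each of the steps a), b), c) enters, and the adjoint solution `v` could be reused for other right-hand sides of the primal problem.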



3.2 Numerical experiments 

In this subsection, we present numerical results for the model problem (more 
numerical tests on the subject can be found in [8]). 



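Test Problem: Find $u$ such that 

$$-\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = f(x,y) \quad \text{in } \Omega = (0,1) \times (0,1), \qquad u = 0 \quad \text{on } \partial\Omega,$$

where the exact solution is the infinitely smooth “hat-function”: 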

r), 



u{x,y) = 



16 

-(4y-3)2 



f (l6 - l-(4x%)0 • “ T: 

if (x,y) e (0.5, 1.0) X (0.5, 1.0), 
otherwise, 



0 , 



As a quantity of interest, we take 

$$\langle \ell, u - u_h \rangle = \int_\omega (u - u_h) \, dx, \quad \text{where } \omega = (0.125, 0.250) \times (0.125, 0.250),$$

and define the effectivity index $I_{\mathrm{eff}} := |\tilde E(u_h, v_\tau)| \, / \, |\langle \ell, u - u_h \rangle|$. 

The results of computations are presented in Table 1 below, where the symbol 
$N \times N$ ($N$ = 16, 32, 64, 128) means that the corresponding test is performed 
on a triangular mesh of the square $(0,1) \times (0,1)$ consisting of $2 \times N \times N$ right 
triangles. The results demonstrate a good performance of the estimator. 



Table 1. Performance of the estimator $\tilde E$. 

Primal mesh   Adjoint mesh   E_0           ~E_1          ~E            <l, u - u_h>   I_eff 
16 x 16       32 x 32        -0.00000501   -0.00000062   -0.00000439   -0.00000495    0.89 
16 x 16       64 x 64        -0.00000496   -0.00000017   -0.00000479   -0.00000495    0.97 
16 x 16       128 x 128      -0.00000495   -0.00000005   -0.00000490   -0.00000495    0.99 
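
The effectivity indices in the last column can be recomputed from the tabulated values of the estimator and of the quantity of interest. The short check below uses only numbers copied from Table 1; the variable names are illustrative.

```python
# (adjoint mesh, estimator value, quantity of interest), copied from Table 1
rows = [
    ("32 x 32", -0.00000439, -0.00000495),
    ("64 x 64", -0.00000479, -0.00000495),
    ("128 x 128", -0.00000490, -0.00000495),
]

# Effectivity index: |estimator| / |quantity of interest|
indices = [round(abs(est) / abs(qoi), 2) for _, est, qoi in rows]
for (mesh, _, _), i_eff in zip(rows, indices):
    print(mesh, i_eff)
```

The index approaches 1 as the adjoint mesh is refined while the primal mesh stays fixed at 16 x 16.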
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4 Final comments 

1) The computations made in item a) can be further used for estimating the 
errors of approximate solutions obtained on other meshes and for different 
right-hand side functions $f$. 

2) The approach is valid for other boundary conditions [8], is suitable for 
estimating local integral norms [8], and can be applied to problems in linear 
elasticity theory [9]. 
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Summary. The work deals with numerical testing of two different numerical meth- 
ods based on finite volumes (FV) and finite elements (FE) for different Reynolds 
numbers. The finite volume method is based on upwind scheme (third order) for con- 
vective terms and central second order for dissipative terms. Finite element method 
consists of stabilization of weak formulation for higher Reynolds numbers with the 
help of streamline-upwind (Petrov-Galerkin) modification. 

The authors compare both numerical results with experiment for laminar flow at $Re \in$ 
(100, 700), where a steady solution exists, using the length of the separation zone on the 
lower wall and, for $Re > 400$, also on the upper wall. 



1 Introduction 

The work deals with the numerical solution of 2D and 3D flow over a 
backward-facing step (BFS). The mathematical model is the system of Navier-Stokes 
or Reynolds-averaged Navier-Stokes (RANS) equations with two-equation turbulence 
models. Concerning BFS flow in 2D, we focus on the transitional 
regimes from laminar to turbulent flow. The start of transition is investigated 
using the model of laminar flow, which makes even small 
differences between different numerical schemes visible. The end of transition 
is investigated using the RANS model. A comparison with measurement data 
is made. The laminar model is extended to 3D. 

We are also interested in comparing the results obtained with finite volume 
and finite element schemes. We discuss the stabilization of finite elements 
with the help of an advanced SUPG method in order to obtain a robust solver for 
unsteady incompressible laminar flows. 



2 Mathematical model 



The equations used to solve laminar flow of an incompressible viscous fluid are 
the Navier-Stokes equations, which in 2D and Cartesian coordinates $(x, y)$ have the 
form 

$$\tilde R W_t + F_x + G_y = \frac{R}{Re} (W_{xx} + W_{yy}), \qquad (1)$$

where 

$$W = \mathrm{col}\,\|p, u, v\|, \quad \tilde R = R = \mathrm{diag}\,\|0, 1, 1\|,$$

$$F = u f + \mathrm{col}\,\|0, p, 0\|, \quad G = v f + \mathrm{col}\,\|0, 0, p\|, \quad f = \mathrm{col}\,\|1, u, v\|, \qquad (2)$$

$Re$ is the Reynolds number, $u$, $v$ are the components of the velocity vector, $p$ is the static pressure 
divided by density, and $\mathrm{col}\,\|\cdot\|$ denotes a column vector. 

In the case of turbulent flow, we solve the Reynolds-averaged Navier-Stokes 
(RANS) equations to obtain the mean flow field. The RANS equations have the 
form of (1) with $(u, v)$, $p$ being mean values of velocity and pressure and the 
right-hand side replaced by 

$$\frac{R}{Re} (W_{xx} + W_{yy}) + S_x + T_y, \qquad (3)$$

with 

$$S = \mathrm{col}\,\|0, \tau_{11}, \tau_{21}\|, \quad T = \mathrm{col}\,\|0, \tau_{12}, \tau_{22}\|, \qquad (4)$$

where $\tau_{ij}$ is the Reynolds stress tensor. It is approximated using the low-Re 
two-equation turbulence models mentioned in Section 5. 

The formulation of laminar and turbulent 2D backward-facing step flow is 
completed by the boundary conditions: fully developed channel 
flow at the inlet, zero velocity on the walls, and zero streamwise derivative 
of velocity at the outlet. 



3 Finite volume method 

The system of Navier-Stokes or RANS equations is solved by means of the artificial 
compressibility method, which replaces the matrix $\tilde R$ multiplying the time derivative in (1) by a regular matrix. 
We have used 

$$\tilde R = \mathrm{diag}\,\|1/U_{\max}, 1, 1\|, \qquad (5)$$

where $U_{\max}$ is the maximum velocity in the domain. The method is thus applicable 
to the solution of steady flows only. 

Eq. (1) with (5) is solved using a cell-centered finite volume method. 
The mean value of $W$ over the finite volume (cell) $D_{i,j}$, denoted $W_{i,j}$, satisfies 
a numerical approximation of 

$$\tilde R \frac{\partial W_{i,j}}{\partial t} |D_{i,j}| + \oint_{\partial D_{i,j}} F \, dy - G \, dx = \frac{R}{Re} \oint_{\partial D_{i,j}} W_x \, dy - W_y \, dx, \qquad (6)$$

where $|D_{i,j}|$ is the area of the cell. The finite volumes compose a structured (multi-block) 
grid consisting of quadrilaterals. The grid is orthogonal and, for turbulent 
cases, refined along the walls. 
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3.1 Spatial discretization 

The integrals in (6) are approximated using the mid-point rule. Cell face values of 
$F$, $G$, $W_x$, $W_y$ thus need to be defined. The face between cells $D_{i,j}$ and $D_{i+1,j}$ is denoted 
by $i+1/2, j$; analogously, the face between cells $D_{i,j}$ and $D_{i,j+1}$ is denoted 
by $i, j+1/2$. 

The discretization of convective and pressure terms consists of defining cell 
face velocities $u_{i+1/2,j}$, $v_{i,j+1/2}$ and pressure in the central way, 

$$u_{i+1/2,j} = \tfrac{1}{2}(u_{i,j} + u_{i+1,j}), \qquad p_{i+1/2,j} = \tfrac{1}{2}(p_{i,j} + p_{i+1,j}),$$
$$v_{i,j+1/2} = \tfrac{1}{2}(v_{i,j} + v_{i,j+1}), \qquad p_{i,j+1/2} = \tfrac{1}{2}(p_{i,j} + p_{i,j+1}),$$

and an upwind biased “Monotone Upstream-centered Schemes for Conservation 
Laws” (MUSCL) [12] interpolation in the direction of grid lines, here the 
line $j = \mathrm{const}$, for the cell face momentum $f$ in (2): 

$$f_{i+1/2,j} = f_i + \tfrac{1}{4}(1 + \kappa)(f_{i+1} - f_i) + \tfrac{1}{4}(1 - \kappa)(f_i - f_{i-1}), \quad u_{i+1/2} > 0,$$
$$f_{i+1/2,j} = f_{i+1} - \tfrac{1}{4}(1 + \kappa)(f_{i+1} - f_i) - \tfrac{1}{4}(1 - \kappa)(f_{i+2} - f_{i+1}), \quad u_{i+1/2} \le 0, \qquad (7)$$

where the constant index $j$ has been omitted on the r.h.s. and $f_{i,j+1/2}$ is 
obtained similarly in the $j$-direction depending on $v_{i,j+1/2}$. We have used $\kappa = 1/3$, 
i.e. an up to third order accurate upwind scheme. The same form is used for turbulent 
flows as well, although the formal accuracy is degraded by the non-regularity of the 
grid. On the other hand, the grid refinement is needed inside shear layers, 
where the diffusive term is dominant. There was no need to use a limiter 
in Eq. (7) for the momentum equations. In the case of the turbulence model equations, 
a limiter is sometimes necessary to achieve better convergence in the 
non-turbulent regions. We have used the minmod limiter. 
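
The upwind-biased interpolation with $\kappa = 1/3$ can be illustrated along one grid line. The sketch below is a generic one-dimensional $\kappa$-scheme on a uniform grid without a limiter; the function name and array layout are assumptions made for this example and are not taken from the authors' solver.

```python
import numpy as np

def muscl_face_values(f, u_face, kappa=1.0 / 3.0):
    """Upwind-biased face values between cells i and i+1 for i = 1..n-3.

    f      : cell values f_0..f_{n-1} along one grid line
    u_face : face velocity (scalar or array of length n-3) selecting the upwind side
    """
    f = np.asarray(f, dtype=float)
    fm, fi, fp, fpp = f[:-3], f[1:-2], f[2:-1], f[3:]
    # extrapolation from the left cell, used when the face velocity is positive
    left = fi + 0.25 * (1 + kappa) * (fp - fi) + 0.25 * (1 - kappa) * (fi - fm)
    # extrapolation from the right cell, used otherwise
    right = fp - 0.25 * (1 + kappa) * (fp - fi) - 0.25 * (1 - kappa) * (fpp - fp)
    return np.where(np.asarray(u_face) > 0, left, right)

# Cell averages of x**2 on unit cells centred at x = 0, 1, ..., 5
cells = np.arange(6.0) ** 2 + 1.0 / 12.0
print(muscl_face_values(cells, u_face=1.0))   # values at the faces x = 1.5, 2.5, 3.5
print(muscl_face_values(cells, u_face=-1.0))
```

With $\kappa = 1/3$ both one-sided formulas reproduce a quadratic profile at the face from its cell averages, which is the reason this value gives the highest formal accuracy on a uniform grid.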

The approximation of the cell face velocity derivatives $R W_x$, $R W_y$ needed in the diffusive 
terms is central. It uses quadrilateral dual finite volumes constructed over 
each face of a primary volume: the vertices are located at the ends of the primary face 
and in the centres of the adjacent primary volumes. The mid-point rule quadrature 
formula is again used, with the face value of velocity defined as the average of the values 
in the vertices of the dual cell [11]. 



3.2 Pressure stabilization 

The finite volume is common for all unknowns. In order to avoid pressure-velocity 
decoupling in this collocated arrangement, a pressure diffusion term 
in the form of the Laplacian of pressure is added to the r.h.s. of the continuity equation. 
The magnitude of the pressure diffusion is adjusted by the local resolution 
of physical diffusion. The term is also essential for the robustness of the method 
[11]. 

The effect resembles the one in the pressure stabilized Petrov-Galerkin method; 
however, in the present formulation the pressure diffusion does not vanish completely 
in the steady state. 



3.3 Discretization in time 

The system (1) with the matrix given in (5) is integrated in time by means of explicit 
methods (multi-stage Runge-Kutta methods and the MacCormack method) or 
by the implicit backward Euler method. In the implicit method, the system of 
algebraic equations is linearized by the Newton method, which results in a block 
nine-diagonal system. The system is solved iteratively using a line Gauss-Seidel 
method, applying direct block tridiagonal system inversion on the lines. For 
turbulent flows, only the implicit method was used. 



4 Finite Element Method 

In order to explain the finite element approach, we re-write the system 
of the incompressible Navier-Stokes equations in nondimensional form 

$$\frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla) u + \nabla p = 0 \quad \text{in } \Omega \times (0, T), \qquad (8)$$

$$\nabla \cdot u = 0 \quad \text{in } \Omega \times (0, T).$$

This system is equipped with suitable boundary conditions: Dirichlet 
conditions (inlet and walls) and a do-nothing boundary condition (outlet). The initial 
condition for the velocity components has to be added. The time derivative of 
the velocity components vanishes for a stationary solution and 
thus can be omitted. On the other hand, we use the time stepping scheme for 
finding the stationary solution. Moreover, an accurate time stepping scheme 
is needed in the case of nonstationary fluid flow. We now proceed to the time and 
space discretization. 

For the time discretization we use a second order implicit scheme, where 

$$\frac{\partial u}{\partial t} \approx \frac{3 u^{n+1} - 4 u^n + u^{n-1}}{2\tau}.$$

This leads to the solution of one nonlinear system in each time step. We use a 
second order linearization of the nonlinear convective term, 

$$(u \cdot \nabla) u \approx [(2 u^n - u^{n-1}) \cdot \nabla] \, u^{n+1},$$

which leads to the following semi-implicit scheme 

$$\frac{3 u^{n+1} - 4 u^n + u^{n-1}}{2\tau} - \nu \Delta u^{n+1} + ((2 u^n - u^{n-1}) \cdot \nabla) \, u^{n+1} + \nabla p^{n+1} = 0, \qquad (9)$$

$$\nabla \cdot u^{n+1} = 0,$$

equipped with appropriate boundary and initial conditions. 

Next, the problem (9) is reformulated in a weak sense, which is suitable for 
the solution with the aid of the finite element method. Defining the velocity 
space $X$ and the pressure space $M$, it is easy to see that 
the solution $U = (u, p)$ of problem (9) satisfies 

$$a(U, V) = f(V), \quad \forall V = (v, q) \in (X, M), \qquad (10)$$

where 

$$a(U, V) = \frac{3}{2\tau}(u, v) + \nu (\nabla u, \nabla v) + \big( ((2 u^n - u^{n-1}) \cdot \nabla) u, v \big) - (p, \nabla \cdot v) + (\nabla \cdot u, q),$$

$$f(V) = \frac{1}{2\tau}(4 u^n - u^{n-1}, v), \qquad (11)$$

and by $(\cdot, \cdot)$ we denote the scalar product in the space $L^2(\Omega)$. Moreover, we 
require that $u$ satisfies the Dirichlet boundary conditions. The couple $(u, p)$ 
represents a solution on time level $n + 1$, i.e. $u^{n+1} := u$ and $p^{n+1} := p$. 

Further, the Galerkin FEM restricts the weak formulation from the 
couple of spaces $(X, M)$ to approximate spaces $(X_h, M_h)$, i.e. find $U_h \in 
(X_h, M_h)$ such that 

$$a(U_h, V_h) = f(V_h), \quad \forall V_h = (v_h, q_h) \in (X_h, M_h). \qquad (12)$$

The couple $(X_h, M_h)$ of finite element spaces should satisfy the BB condition, 
which guarantees the stability of the scheme: there exists a constant $c > 0$ 
such that 

$$\sup_{v \in X_h} \frac{(p, \nabla \cdot v)}{\|v\|} \ge c \, \|p\|, \quad \forall p \in M_h. \qquad (13)$$

We use the Taylor-Hood family of finite elements ($P^{k+1}/P^k$ approximation). 

The discretization (10) of the convection part may lead to 2nd order accuracy, 
but the approximate solution may suffer from spurious oscillations for 
high Reynolds numbers. In order to avoid this drawback, we apply 
stabilization via the streamline-diffusion/Petrov-Galerkin technique (see, e.g., [5], [2]). 
In the stabilized problem we consider some additional terms defined by 

$$\mathcal{L}_{h,n}(U, V) = \sum_{K} \delta_K \Big( \frac{3}{2\tau} u - \nu \Delta u + (w \cdot \nabla) u + \nabla p, \ (w \cdot \nabla) v \Big)_K,$$

$$\mathcal{F}_{h,n}(V) = \sum_{K} \delta_K \Big( \frac{1}{2\tau}(4 u^n - u^{n-1}), \ (w \cdot \nabla) v \Big)_K, \qquad (14)$$

where the function $w$ stands for $w = 2 u^n - u^{n-1}$ and by $(\cdot, \cdot)_K$ we denote the 
scalar product in the space $L^2(K)$. The parameter $\delta_K$ is a function of the local 
(element) Reynolds number $Re^w$ based on the transport velocity $w$, defined through 
a function $\xi(Re^w)$ (15); the parameter $\delta^*$ is an additional free parameter. 

The resulting stabilized system then reads 

$$a(U_h, V_h) + \mathcal{L}_{h,n}(U_h, V_h) + \sum_{K \in \mathcal{T}_h} \tau_K (\nabla \cdot u_h, \nabla \cdot v_h)_K = f(V_h) + \mathcal{F}_{h,n}(V_h), \qquad (16)$$

where we used the additional grad-div stabilization; for details see, e.g., [2]. 

The space-time discretization leads to the solution of the following system 
of equations, repeated for each time step: 

$$S u + 2\tau B p = f, \qquad B^T u = 0, \qquad (17)$$

where $u \in \mathbb{R}^{n_h}$ and $p \in \mathbb{R}^{m_h}$ are vectors whose components represent the degrees 
of freedom defining the velocity $u$ and the pressure $p$, respectively, $S$ is a nonsingular 
$n_h \times n_h$ matrix and $B$ is an $n_h \times m_h$ matrix. In practical computations 
we solve this system with the help of iterative methods or, for smaller systems, 
with the help of advanced direct methods, see, e.g., [6]. UMFPACK is 
a set of routines for solving unsymmetric sparse linear systems, $Ax = b$, using 
the Unsymmetric MultiFrontal method, written in ANSI/ISO C. 



5 Results 

The Reynolds number is defined using 2/3 of maximum velocity in the inlet 
(i.e. bulk velocity in the laminar case), and step height. 

Figure 1 shows the distance of the reattachment point on the lower wall 
from the step. The grid has regular spacing in both directions equal to 0.05. 
The result of the finite volume method on a coarse grid with double spacing is also 
shown. 

At higher Reynolds numbers, the laminar flow separates on the upper wall 
as well. The results are shown in Fig. 2. For higher Reynolds numbers, this separation is 
predicted in better agreement with measurement than the primary 
one in Fig. 1. 

For $Re > 6600$, the flow is fully turbulent and two-dimensional in the mean. 
The primary separation length becomes independent of $Re$ and, according to [7], 
equals 8 step heights. Our numerical results give considerably higher values: 8.83 using the 
SST (Shear Stress Transport) turbulence model [9] and 9.05 using the sst mC k-ε 
(modified Chien k-ε with shear stress transport) one [10]. Both turbulence 
models are low-Re two-equation ones; the first one uses the k-ω formulation, the second 
one k-ε. Several profiles of the streamwise component of velocity for these models 
are compared in the upper part of Fig. 3. The lower part of the same figure shows 
the k-variable of the turbulence models, which can be interpreted as the kinetic 
energy of velocity fluctuations. The structure of the primary separation also depends strongly 
on the turbulence model, as shown in Fig. 4. However, the strong dependence 
on the numerical method (and grid) should be mentioned as well; e.g. the small 
corner vortex for the SST model disappears if the limiter in the ω-equation is used. 
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Fig. 1. Prediction of primary separation zone 
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Fig. 3. BFS flow with SST and sst mC k-e model, Re = 6667 
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Fig. 4. Detail of streamlines for BFS flow, SST (left) and sst mC k-e model (right), 
Re = 6667 
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Summary. We show the existence of a solution with a periodic integral average 
of the magnetization. This property is generic for both uniaxial as well as cubic 
ferromagnets. Our proof mainly uses properties of time-discrete approximations and 
a fixed point theorem. 



1 Introduction 



In this paper we show the existence of a solution to a hysteresis model of 
bulk ferromagnets established in [12] having time-periodic spatial averages 
of the magnetization. This holds under general assumptions for uniaxial as 
well as for cubic ferromagnets. The model is based on Brown’s theory 
of micromagnetics [1], which is here enriched by a suitable rate-independent 
dissipation mechanism. The basic assumption is that the transformation of 
the magnetization from one pole to another requires a certain amount of 
energy. This energy is related to the coercive force $H_c$. The rate-independence 
allows for the description of pure hysteresis losses [5] and is well accepted for 
a fairly wide range of frequencies of external magnetic fields. There are also 
other attempts in the literature to build phenomenological rate-independent 
dissipation mechanisms into the models; see e.g. [13]. We assume sufficiently 
slow processes so that the released heat can be removed (hence the process is 
isothermal). The model fully relies on energy principles and is based on 
two main requirements, namely, stability (1) and the energy inequality (2). 
Roughly speaking, we say that $q = q(t)$ is a solution process if 

$$\forall \tilde q: \quad I(t, q(t)) \le I(t, \tilde q) + \mathcal{D}(q(t), \tilde q) \quad \text{and} \qquad (1)$$

$$I(t, q(t)) + \mathrm{Var}(\mathcal{D}, q; s, t) \le I(s, q(s)) + \int_s^t \partial_t I(\theta, q(\theta)) \, d\theta, \qquad (2)$$

where $I$ is Gibbs’ stored energy of the system, “Var” stands for the total 
variation, $\mathcal{D}$ is a dissipation functional ensuring a rate-independent response, 
and $s < t$, $s, t \in [0, T]$, where $[0, T]$ is the process time interval. The formulation of 
rate-independent evolutionary processes in continuum mechanics by means of 
(1) and (2) appeared in [10]. Its application to problems of rate-independent 
hysteresis in micromagnetics appeared in [12]. In particular, the authors of [12] proved the 
existence of a solution to a model capturing a virgin magnetization process. 

Not much is known about properties of solutions to (1) and (2). Mielke 
and Theil [9, Th. 7.1] showed the uniqueness of the solution if I is smooth 
and uniformly convex in q. However, these assumptions do not hold in our 
application. As our functional I comes from the convexification of a function 
with multiple minima it is not strictly convex and it is affine along the easy axis. 
Nevertheless, analyzing time-discrete problems corresponding to (1) and (2) we 
can show periodic behavior of the average magnetization. The main tool is the 
Tychonoff fixed point theorem and uniqueness of the average magnetization in 
the time-discrete case. 



2 Model 

2.1 Stored energy and its relaxation 

The theory of rigid ferromagnetic bodies [1] assumes that a magnetization
$m : \Omega \to \mathbb{R}^n$, describing the state of a body $\Omega \subset \mathbb{R}^n$, $n = 2,3$, is subjected
to the Heisenberg-Weiss constraint, i.e. has a given (in general, temperature
dependent) magnitude

$$|m(x)| = M_s \quad\text{for almost all } x \in \Omega ,$$

where $M_s > 0$ is the saturation magnetization, considered here as a constant
(since temperature is considered constant, too).

In the no-exchange formulation, which is valid for large bodies [4], the
Helmholtz free energy of a rigid ferromagnetic body $\Omega \subset \mathbb{R}^n$ consists of two
parts. The first part is the anisotropy energy $\int_\Omega \varphi(m(x))\,\mathrm{d}x$ related to
crystallographic properties of the ferromagnet. A typical $\varphi : S := \{s \in \mathbb{R}^n;\ |s| = M_s\} \to \mathbb{R}$
is a nonnegative function vanishing only at a few isolated points on
$S$ determining directions of easy magnetization, e.g. at two points for uniaxial
materials or at six (or eight) for cubic ones. The second part of the Helmholtz
energy, $\frac12\int_{\mathbb{R}^n} |\nabla u_m(x)|^2\,\mathrm{d}x$, is the energy of the demagnetizing field $\nabla u_m$
self-induced by the magnetization $m$; its potential $u_m$ is governed (after neglect of
many terms in the full Maxwell system) by

$$\operatorname{div}\big(-\nabla u_m + m\chi_\Omega\big) = 0 \quad\text{in } \mathbb{R}^n , \tag{3}$$

where $\chi_\Omega$ is the characteristic function of $\Omega$. The demagnetizing-field
energy thus penalizes non-divergence-free magnetization vectors. Standardly,
we will understand (3) in the weak sense, i.e. $u_m \in H^1(\mathbb{R}^n)$ will be
called a weak solution to (3) if the integral identity
$\int_{\mathbb{R}^n} \big(m(x)\chi_\Omega(x) - \nabla u_m(x)\big)\cdot\nabla v(x)\,\mathrm{d}x = 0$ holds for all $v \in H^1(\mathbb{R}^n)$,
where $H^1(\mathbb{R}^n) = W^{1,2}(\mathbb{R}^n)$ denotes the Sobolev space of functions from $L^2(\mathbb{R}^n)$
with all first derivatives (in the distributional sense) also in $L^2(\mathbb{R}^n)$.
Altogether, the Helmholtz energy $E(m)$ has the form

$$E(m) = \int_\Omega \varphi(m(x))\,\mathrm{d}x + \frac12\int_{\mathbb{R}^n} |\nabla u_m(x)|^2\,\mathrm{d}x . \tag{4}$$

If the ferromagnetic specimen is exposed to some external magnetic field
$h = h(x)$, the so-called Zeeman energy of interactions between this field
and magnetization vectors equals $H(m) := \int_\Omega h(x)\cdot m(x)\,\mathrm{d}x$. Finally, the
following variational principle governs equilibrium configurations:

$$\left.\begin{array}{ll}
\text{minimize} & G(m) := E(m) - H(m) \\
 & \quad = \displaystyle\int_\Omega \big(\varphi(m(x)) - h(x)\cdot m(x)\big)\,\mathrm{d}x + \frac12\int_{\mathbb{R}^n} |\nabla u_m(x)|^2\,\mathrm{d}x, \\
\text{subject to} & \text{(3)},\quad (m,u_m) \in \mathcal{A}\times H^1(\mathbb{R}^n),
\end{array}\right\} \tag{5}$$

where the introduced notation $G$ stands for Gibbs' energy and $\mathcal{A}$ is the set of
admissible magnetizations

$$\mathcal{A} := \{m \in L^\infty(\Omega;\mathbb{R}^n);\ |m(x)| = M_s \text{ for almost all } x \in \Omega\} .$$

As $\mathcal{A}$ is not convex we cannot rely on direct methods in proving the existence
of a solution. In fact, a solution to (5) need not exist in $\mathcal{A}\times H^1(\mathbb{R}^n)$. Due to
the nonconvexity of $\mathcal{A}$, weak limits of minimizing sequences of (5) do not necessarily
live in $\mathcal{A}\times H^1(\mathbb{R}^n)$.

It is, therefore, natural to look for an extension (= relaxation) of our problem
in which we would properly describe the behavior of (5) along minimizing
sequences. It is well-known [4] that such relaxation can be achieved by
extending the Helmholtz energy by continuity on the convex set of the so-called
Young measures [14]

$$E(\nu) = \int_\Omega \varphi\bullet\nu\,\mathrm{d}x + \frac12\int_{\mathbb{R}^n} |\nabla u_{\mathrm{id}\bullet\nu}(x)|^2\,\mathrm{d}x , \tag{6}$$

where $[v\bullet\nu](x) := \int_S v(s)\,\nu_x(\mathrm{d}s)$ and $\mathrm{id} : \mathbb{R}^n \to \mathbb{R}^n$ is the identity. The
set of Young measures $\mathcal{Y}(\Omega;S) \subset L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S)) \cong L^1(\Omega;C(S))^*$ is the
set of all weakly measurable essentially bounded mappings $x \mapsto \nu_x : \Omega \to \mathrm{rca}(S) \cong C(S)^*$
such that $\nu_x$ is a probability Radon measure supported on the
sphere $S$ for a.a. $x \in \Omega$; the adjective "weakly measurable" means that $v\bullet\nu$
is Lebesgue measurable for any $v \in C(S)$. A natural embedding of a magnetization
$m \in L^\infty(\Omega;\mathbb{R}^n)$, $|m(x)| = M_s$, to $\mathcal{Y}(\Omega;S)$ is $\nu = i(m)$ defined by
$\nu_x = \delta_{m(x)}$ with $\delta_s$ denoting the Dirac measure at $s \in S$. We say that a sequence
$\{\nu^k\}_{k\in\mathbb{N}} \subset \mathcal{Y}(\Omega;S)$ converges weakly* to $\nu$ if $\lim_{k\to\infty}\langle\nu^k, f\rangle = \langle\nu, f\rangle$
for any $f \in L^1(\Omega;C(S))$ or, equally, for any $f = g\otimes v$ with $g \in L^1(\Omega)$ and
$v \in C(S)$, where the tensorial notation means naturally $[g\otimes v](x,s) = g(x)v(s)$.
From the last fact, we can also say that $\nu^k \to \nu$ weakly* if and only if
w*-$\lim_{k\to\infty} v\bullet\nu^k = v\bullet\nu$ for all $v \in C(S)$, where the weak*-limit is understood
in $L^\infty(\Omega)$. Considering the weak* topology on $L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S))$ makes
$\mathcal{Y}(\Omega;S)$ a convex, metrizable compact set containing densely the set of admissible
magnetizations $\mathcal{A}$ if embedded via $i$.

As shown in [4, 11], a correct relaxation (= natural extension) of (5) is

$$\left.\begin{array}{ll}
\text{minimize} & G(\nu) := E(\nu) - H(\mathrm{id}\bullet\nu) , \\
\text{subject to} & \text{(3) with } m = \mathrm{id}\bullet\nu,\quad (\nu,u) \in \mathcal{Y}(\Omega;S)\times H^1(\mathbb{R}^n).
\end{array}\right\} \tag{7}$$

The model (7) represents a so-called mesoscopic level model, because a minimizing
Young measure $\nu$ records some, but not full, information about spatial
oscillations of a minimizing sequence of (5) around each "macroscopic" point
$x$ through volume fractions described by the probability distribution $\nu_x$. This
information makes it possible to describe the effective magnetic properties by
means of the first moment, the "macroscopic" magnetization $m = \mathrm{id}\bullet\nu$, and
moreover seems sufficient for designing a dissipative mechanism in good
agreement with experiments, which will be exploited further.

2.2 Rate-independent dissipation

For usual loading regimes and magnetically hard materials, one must consider
certain dissipation. Our simplified standpoint is that the amount of dissipated
energy within the phase transformation from one pole to the other can be
described by a single, phenomenologically given number (of the dimension
J/m$^3$ = Pa) depending on the coercive force $H_c$. Hence, we need to identify the
particular poles according to the magnetization vector. Inspired by [8, 10] and
considering $L$ poles ($L = 2$ for uniaxial magnets or 6 or 8 for cubic magnets),
we define a continuous mapping $\mathcal{L} : S \to \triangle_L$ where
$\triangle_L := \{\xi \in \mathbb{R}^L;\ \xi_i \ge 0,\ i = 1,\dots,L,\ \sum_{i=1}^L \xi_i = 1\}$.
In other words, $\{\mathcal{L}_1,\dots,\mathcal{L}_L\}$ forms a partition
of unity on $S$ such that $\mathcal{L}_i(s)$ is equal to 1 if $s$ is in the $i$-th pole, i.e. $s \in S$ is in
a neighborhood of the $i$-th easy-magnetization direction. Of course, $\mathcal{L}(m)$ in the
(relative) interior of $\triangle_L$ indicates $m$ in the region where no definite pole is
specified. Hence $\mathcal{L}$ plays the role of what is often called an order parameter.
In terms of the mesoscopic microstructure described by the Young measure $\nu$,
the "mesoscopic" order parameter is naturally defined as

$$\lambda := \mathcal{L}\bullet\nu , \tag{8}$$

where $[\mathcal{L}\bullet\nu](x) := \int_S \mathcal{L}(s)\,\nu_x(\mathrm{d}s)$. Thus $\lambda$ is just a continuous extension of the
mapping $m \mapsto \mathcal{L}(m)$, i.e. if $\{m_k\}$ converges to $\nu$ weakly* in $L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S))$,
then $\mathcal{L}(m_k) \to \mathcal{L}\bullet\nu$ weakly* in $L^\infty(\Omega;\mathbb{R}^L)$.
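The first moment and the order parameter of a Young measure can be illustrated on the simplest case of a finite convex combination of Dirac measures. The following is a minimal sketch under stated assumptions: the names `first_moment`, `order_parameter` and `uniaxial_pole_map` are hypothetical, and the particular partition of unity chosen for a uniaxial magnet ($L = 2$) is only one possible continuous choice, not the one used in the cited works.

```python
import math

def first_moment(atoms, weights):
    """First moment sum_j w_j * s_j of the discrete measure sum_j w_j * delta_{s_j}."""
    assert abs(sum(weights) - 1.0) < 1e-12 and all(w >= 0 for w in weights)
    dim = len(atoms[0])
    return tuple(sum(w * s[d] for s, w in zip(atoms, weights)) for d in range(dim))

def order_parameter(atoms, weights, pole_map):
    """Mesoscopic order parameter sum_j w_j * L(s_j) for the same discrete measure."""
    values = [pole_map(s) for s in atoms]
    return tuple(sum(w * v[i] for v, w in zip(values, weights))
                 for i in range(len(values[0])))

def uniaxial_pole_map(s, Ms=1.0):
    """Hypothetical continuous partition of unity for easy axis e_1 (assumption, L = 2).

    The first component equals 1 near +Ms*e_1 and 0 near -Ms*e_1.
    """
    c = min(1.0, max(0.0, 0.5 + s[0] / Ms))
    return (c, 1.0 - c)

Ms = 1.0
atoms = [(Ms, 0.0), (-Ms, 0.0)]   # the two easy directions on the circle S
weights = [0.75, 0.25]            # volume fractions of the two poles
m = first_moment(atoms, weights)                           # macroscopic magnetization
lam = order_parameter(atoms, weights, uniaxial_pole_map)   # lies in the simplex
```

In this sketch the macroscopic magnetization has magnitude below $M_s$, while the order parameter records the volume fractions of the two poles, which is the information the dissipation mechanism uses.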

To describe phenomenologically the dissipative energetics, one must prescribe
a (pseudo)potential of dissipative forces $\varrho$ as a function of the rate of
$\lambda$. For rate-independent processes, this potential must be convex and homogeneous
of degree 1. Considering a (not necessarily Euclidean) norm $|\cdot|_L$
on $\mathbb{R}^L$ one can postulate $\varrho(\dot\lambda) = H_c|\dot\lambda|_L$, where the constant $H_c > 0$ is the
so-called coercive force. The energy needed to transform the $i$-th pole to the $j$-th pole is
then $H_c|e_i - e_j|_L$ with $e_i$ the unit vector with 1 at the $i$-th position. The
state of the specimen $\Omega$ (at a given time $t$) will be described by the couple
$q \equiv q(t) = (\nu,\lambda)$ with $\nu = \{\nu_{x,t}\}_{x\in\Omega}$. Let us denote by $Q$ the convex set of
admissible configurations:

$$Q := \big\{q = (\nu,\lambda) \in \mathcal{Y}(\Omega;S)\times L^\infty(\Omega;\mathbb{R}^L);\ \lambda(x) \in \triangle_L,\ \mathcal{L}\bullet\nu = \lambda \text{ for a.a. } x \in \Omega\big\} . \tag{9}$$

The total dissipation of the process between states $q_1, q_2 \in Q$ is defined as
[6, 12]

$$\mathcal{D}(q_1,q_2) := \int_\Omega H_c|\lambda_1 - \lambda_2|_L\,\mathrm{d}x, \qquad q_i = (\nu_i,\lambda_i) \in Q,\ i = 1,2. \tag{10}$$

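For the analysis below, we will need to consider rather a certain regularization
of the stored energy $E$ which would control spatial smoothness of $\lambda$. For
this, we will augment $E$ by a higher-order term

$$\mathcal{E}_\rho(\nu,\lambda) := \begin{cases} E(\nu) + \rho\,\|\lambda\|^2_{H^\alpha(\Omega;\mathbb{R}^L)} & \text{if } \lambda \in H^\alpha(\Omega;\mathbb{R}^L), \\ +\infty & \text{otherwise,} \end{cases} \tag{11}$$

where $H^\alpha(\Omega) = W^{\alpha,2}(\Omega)$ denotes the usual Sobolev-Slobodetskii space and
where we assume

$$\alpha, \rho > 0 \ \text{ fixed.} \tag{12}$$

From now on, we will work with this regularized relaxed stored energy $\mathcal{E}_\rho$
rather than $E$.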

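Let us abbreviate the Gibbs energy by

$$\mathcal{G}(t,q) := \mathcal{E}_\rho(q) - \langle\mathcal{H}(t),q\rangle , \tag{13}$$

where

$$\langle\mathcal{H}(t),q\rangle = \int_\Omega h(x,t)\cdot[\mathrm{id}\bullet\nu](x)\,\mathrm{d}x = \langle\nu, h(\cdot,t)\otimes\mathrm{id}\rangle . \tag{14}$$

Let us agree to identify quite naturally the mapping $t \mapsto \nu(t) = \{[\nu(t)]_x\}_{x\in\Omega}$
with a Young measure $(x,t) \mapsto \nu_{x,t}$.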

Definition 1. We say that a process $q = q(t)$ is stable if

$$\forall \tilde q \in Q:\quad \mathcal{G}(t,q(t)) \le \mathcal{G}(t,\tilde q) + \mathcal{D}(q(t),\tilde q) \tag{15}$$

for all $t \in [0,T]$.
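Definition 2. We say that the process $q = q(t)$ satisfies the energy inequality
if for all $s,t \in [0,T]$, $s \le t$,

$$\underbrace{\mathcal{G}(t,q(t))}_{\text{effective Gibbs' energy at time } t} + \underbrace{\mathrm{Var}(\mathcal{D},q;s,t)}_{\text{dissipated energy}} \le \underbrace{\mathcal{G}(s,q(s))}_{\text{Gibbs' energy at time } s} - \underbrace{\int_s^t \Big\langle\frac{\mathrm{d}\mathcal{H}}{\mathrm{d}\theta}, q(\theta)\Big\rangle\,\mathrm{d}\theta}_{\text{reduced work of external field}} \tag{16}$$

where the total variation over the time interval $[s,t]$ is defined standardly,
without using explicitly any time derivative, as

$$\mathrm{Var}(\mathcal{D},q;s,t) := \sup\sum_{i=1}^{J} \mathcal{D}\big(q(t_{i-1}),q(t_i)\big) = \sup\sum_{i=1}^{J}\int_\Omega H_c\big|\lambda(t_{i-1}) - \lambda(t_i)\big|_L\,\mathrm{d}x, \tag{17}$$

where the supremum is taken over all $J \in \mathbb{N}$ and over all partitions of $[s,t]$ in
the form $s = t_0 < t_1 < \dots < t_{J-1} < t_J = t$.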



In what follows “BV” stands for a space of maps with bounded variations. 

Definition 3. The process $q \equiv q(t)$, $q = (\nu,\lambda)$, will be considered as a solution
if $\nu \in \mathcal{Y}(\Omega\times[0,T];S)$, $\lambda \in \mathrm{BV}([0,T];L^1(\Omega;\mathbb{R}^L))$ and $q(t) \in Q$ for all $t \in [0,T]$,
and it is stable in the sense (15) for all $t \in [0,T]$ and satisfies the energy
inequality (16) for a.a. $s,t \in [0,T]$, $s \le t$.

The existence of a response $q$ with the above mentioned properties was
shown even in a more general case in [12] by a semi-discretization in time,
using the implicit Euler scheme. For simplicity, let us consider an equidistant
partition of the time interval $[0,T]$ with a time step $\tau > 0$, assuming $T/\tau$ an
integer. Even more, we consider a sequence of $\tau$'s converging to zero and such
that $\tau_i/\tau_{i+1}$ is an integer, i.e. each next partition is a refinement of the preceding
one.

Then we put $q_\tau^0 = q_0$, a given initial condition, and, for $k = 1,\dots,T/\tau$ we
define $q_\tau^k$ recursively as a solution of the minimization problem

$$\left.\begin{array}{ll}
\text{Minimize} & I(q) := \mathcal{G}(k\tau,q) + \mathcal{D}(q_\tau^{k-1},q) \\
\text{subject to} & q = (\nu,\lambda) \in Q ,
\end{array}\right\} \tag{18}$$

where $Q$ is from (9), $\mathcal{G}$ is from (13), and $\mathcal{D}$ from (10). If a solution (i.e. a global
minimizer) to (18) is not unique, we just take an arbitrary one for $q_\tau^k$. Then
we define the piecewise constant interpolation $q_\tau \in L^\infty(0,T; L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S))\times L^\infty(\Omega;\mathbb{R}^L))$
so that $q_\tau|_{((k-1)\tau,k\tau]} = q_\tau^k$ for $k = 1,\dots,T/\tau$ while for $t = 0$ we
put $q_\tau(0) = q_0$. Besides, assuming
$$h \in W^{1,1}([0,iT]; L^1(\Omega;\mathbb{R}^n)), \qquad h(\cdot,t+jT) = h(\cdot,t), \quad 1 \le j \le i-1,\ i \in \mathbb{N},\ t \in [0,T] , \tag{19}$$

we have certainly $\mathcal{H} \in C([0,T];L^1(\Omega;C(S)))$ and we can define the piecewise
constant approximation of $\mathcal{H}$, denoted by $\mathcal{H}_\tau$, by $\mathcal{H}_\tau(t) = \mathcal{H}(k\tau)$ for
$t \in ((k-1)\tau,k\tau]$ and by $\mathcal{H}_\tau(0) = \mathcal{H}(0)$ for $t = 0$. Besides, we will still need
the piecewise affine interpolation, denoted by $\hat{\mathcal{H}}_\tau$, i.e. $\hat{\mathcal{H}}_\tau$ is affine in time if
restricted on the interval $[(k-1)\tau,k\tau]$ for $k = 1,\dots,T/\tau$ and $\hat{\mathcal{H}}_\tau(k\tau) = \mathcal{H}(k\tau)$
for $k = 0,\dots,T/\tau$. Also, we will naturally assume that the initial condition $q_0$
is admissible and even stable:

$$q_0 \in Q \quad\text{and}\quad \mathcal{E}_\rho(q_0) \le \mathcal{E}_\rho(q) + \mathcal{D}(q_0,q) + \langle\mathcal{H}(0), q_0 - q\rangle \quad \forall q \in Q; \tag{20}$$

note that it implies, in particular, that $\mathcal{E}_\rho(q_0) < +\infty$. The scheme (18) together
with a suitable spatial discretization is also suitable for a numerical solution
to our problem.
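The recursive minimization (18) can be illustrated on a drastically simplified scalar toy problem. The sketch below is only an assumption-laden illustration, not the Young-measure problem itself: the state is a single number `lam`, the stored energy is the hypothetical quadratic `0.5*a*lam**2`, the external field enters as `-h*lam`, and the dissipation is `Hc*abs(lam - lam_prev)`. For this toy problem the global minimizer at each time step is available in closed form.

```python
import math

def incremental_step(lam_prev, h, a, Hc):
    """Global minimizer of 0.5*a*lam**2 - h*lam + Hc*abs(lam - lam_prev), a > 0."""
    driving = h - a * lam_prev       # minus the energy slope at the previous state
    if abs(driving) <= Hc:           # below the coercive threshold: the state does not move
        return lam_prev
    return (h - math.copysign(Hc, driving)) / a

def run(h_of_t, T, n_steps, lam0, a=1.0, Hc=0.5, periods=1):
    """Implicit time stepping over several periods; returns the states and the dissipation."""
    tau = T / n_steps
    lam, path, dissipated = lam0, [lam0], 0.0
    for k in range(1, periods * n_steps + 1):
        new = incremental_step(lam, h_of_t(k * tau), a, Hc)
        dissipated += Hc * abs(new - lam)   # discrete analogue of the total variation (17)
        lam = new
        path.append(lam)
    return path, dissipated

T = 1.0
h = lambda t: 2.0 * math.sin(2.0 * math.pi * t / T)   # T-periodic external field
path, dissipated = run(h, T, n_steps=200, lam0=0.0, periods=3)
# difference between the states at the end of the second and the third period
periodic_gap = abs(path[2 * 200] - path[3 * 200])
```

In this toy run the state at the end of the first period differs from the initial state `lam0 = 0.0`, whereas the states at the ends of later periods coincide up to rounding. Starting from that repeated state would give a periodic discrete trajectory, which is the role played by the fixed point of the mapping $Z$ constructed below.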

Let us define a sufficiently large set $\mathcal{V}$ where the values of all the processes
$q_\tau(\cdot)$ will certainly live; here it is natural to put

$$\mathcal{V} := \big\{(\nu,\lambda) \in Q;\ \|\lambda\|_{H^\alpha(\Omega;\mathbb{R}^L)} \le C_1\big\}; \tag{21}$$

the constant $C_1$ can be now considered arbitrary but sufficiently large, and
will be fixed later, see (22). We will endow $\mathcal{V}$ by the (weak*$\times$weak)-topology
of $L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S))\times H^\alpha(\Omega;\mathbb{R}^L)$. Clearly, the set $\mathcal{V}$ is compact.

The following can be found in [6] or in a slightly different form in [12]. 

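Proposition 1. Let (12), (19) and (20) hold. Let $q_\tau = (\nu_\tau,\lambda_\tau)$ be a solution
constructed recursively from solutions to (18) at the prescribed time increments.
The following a-priori estimates hold:

$$\|\lambda_\tau\|_{L^\infty(0,T;H^\alpha(\Omega;\mathbb{R}^L)\cap L^\infty(\Omega;\mathbb{R}^L))\,\cap\,\mathrm{BV}([0,T];L^1(\Omega;\mathbb{R}^L))} \le C_1, \tag{22}$$

$$\|\nu_\tau\|_{L^\infty(0,T;L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S)))} \le C_2, \tag{23}$$

$$\|\mathfrak{G}_\tau\|_{\mathrm{BV}([0,T])} \le C_3, \tag{24}$$

where $\mathfrak{G}_\tau(t) := \mathcal{E}_\rho(q_\tau(t)) - \langle\mathcal{H}_\tau(t),q_\tau(t)\rangle$ denotes Gibbs' energy of the approximate
trajectory.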

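Proposition 1 shows that solutions to (18) live in $\mathcal{V}$ if $C_1$ is large enough.
Let us denote

$$\mathcal{P} := \big\{(\beta,\lambda) \in \mathbb{R}^n\times H^\alpha(\Omega;\mathbb{R}^L);\ \exists(\nu,\lambda) \in \mathcal{V} \text{ such that } \beta = \bar\nu\big\} , \tag{25}$$

where

$$\bar\nu := \frac{1}{\operatorname{meas}\Omega}\int_\Omega\int_S s\,\nu_x(\mathrm{d}s)\,\mathrm{d}x = \frac{1}{\operatorname{meas}\Omega}\int_\Omega m(x)\,\mathrm{d}x , \tag{26}$$

i.e., $\bar\nu$ is the average of the macroscopic magnetization over the specimen $\Omega$,
in particular $|\bar\nu| \le M_s$. We call $\bar\nu$ the $\Omega\times S$ average of $\nu$. Endowing $\mathcal{P}$ with
the Euclidean topology of $\mathbb{R}^n$ $\times$ the weak topology of $H^\alpha(\Omega;\mathbb{R}^L)$ we see that
$\mathcal{P}$ is convex, closed and bounded in the Banach space $\mathbb{R}^n\times H^\alpha(\Omega;\mathbb{R}^L)$ with
the norm $\|(\beta,\lambda)\| = |\beta| + \|\lambda\|_{H^\alpha(\Omega;\mathbb{R}^L)}$. We define

$$\Theta : \mathcal{V} \to \mathcal{P}:\quad \Theta(\nu,\lambda) = (\bar\nu,\lambda) . \tag{27}$$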

Lemma 1. Let (12) hold. Let $q_1 = (\nu_1,\lambda_1)$, $q_2 = (\nu_2,\lambda_2)$ be two solutions of
(18) in $Q$ for some fixed $1 \le k \le T/\tau$. Then $\Theta(q_1) = \Theta(q_2)$.

Proof. The assertion $\lambda_1 = \lambda_2$ follows from the convexity of $I$ from (18) in $q$
and the strict convexity of $I$ in $\lambda$. Note also that $I$ is strictly convex in $\nabla u_{\mathrm{id}\bullet\nu}$,
i.e., in the magnetostatic field, so both minimizers generate the same magnetostatic field.
Denoting $m_1$ and $m_2$ the macroscopic magnetization
vectors corresponding to $q_1$ and $q_2$, respectively, the weak formulation
of (3) gives $\int_{\mathbb{R}^n}(m_1(x) - m_2(x))\chi_\Omega(x)\cdot\nabla v(x)\,\mathrm{d}x = 0$ for any $v \in H^1(\mathbb{R}^n)$. As
for any $w \in \mathbb{R}^n$ one can find $v \in H^1(\mathbb{R}^n)$ such that $\nabla v = w$ in $\Omega$ we have
$\int_\Omega m_1(x)\,\mathrm{d}x = \int_\Omega m_2(x)\,\mathrm{d}x$ and the result follows. □

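Let $Z : \mathcal{P} \to \mathcal{P}$ be defined as follows. For any $(\beta_0,\lambda_0) \in \mathcal{P}$ we take
$q_0 = (\nu_0,\lambda_0) \in \mathcal{V}$ such that $\beta_0 = \bar\nu_0$ and solve (18) for all $k = 1,\dots,T/\tau$.
Then we calculate $\Theta(q_\tau^{T/\tau})$ and we set $Z(\beta_0,\lambda_0) := (\bar\nu_\tau^{T/\tau},\lambda_\tau^{T/\tau})$, where
$q_\tau^{T/\tau} = (\nu_\tau^{T/\tau},\lambda_\tau^{T/\tau})$. Note that the image of $(\beta_0,\lambda_0)$ does not depend on the
particular $(\nu_0,\lambda_0) \in \mathcal{V}$ having $\bar\nu_0 = \beta_0$ because solutions in (18) depend only
on $\lambda_0$ in the initial condition.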

Lemma 2. The mapping $Z : \mathcal{P} \to \mathbb{R}^n\times H^\alpha(\Omega;\mathbb{R}^L)$ is weakly sequentially
continuous in $\mathbb{R}^n\times H^\alpha(\Omega;\mathbb{R}^L)$.

Sketch of proof. We can suppose that $T/\tau = 1$. Otherwise we write $Z$
as a composition of analogously defined mappings from the $(k-1)$th step
to the $k$th step. Having $(\beta_j,\lambda_j) \to (\beta_0,\lambda_0)$ in $\mathcal{P}$ for $j \to \infty$ we denote
$F_j(q) := \mathcal{G}(T,q) + \mathcal{D}(q,q_j)$, and $F_0(q) := \mathcal{G}(T,q) + \mathcal{D}(q,q_0)$, $q_j = (\nu_j,\lambda_j)$, $j \ge 0$.
We have $\lim_{j\to\infty} F_j = F_0$ uniformly in $\mathcal{V}$ and as the $F_j$ are sequentially lower
semicontinuous on $\mathcal{V}$, by [2] any sequence of minimizers of $\{F_j\}_{j\ge1}$ contains a
subsequence converging to a minimizer of $F_0$. As the $\Omega\times S$ average of $\nu$-components
of minimizers of $F_0$ is given uniquely, the whole sequence of $\Omega\times S$ averages
of $\nu$-components of minimizers of $F_j$ converges to the $\Omega\times S$ average of the
$\nu$-component of minimizers of $F_0$. Altogether $\lim_{j\to\infty} Z(\beta_j,\lambda_j) = Z(\beta_0,\lambda_0)$.
□

Lemma 3. There is $(\beta,\lambda) \in \mathcal{P}$ such that $Z(\beta,\lambda) = (\beta,\lambda)$.

Proof. $\mathcal{P}$ is a closed, convex and bounded subset of the reflexive Banach
space $\mathbb{R}^n\times H^\alpha(\Omega;\mathbb{R}^L)$. As $Z$ is weakly sequentially continuous in $\mathcal{P}$ the
assertion follows from the Tychonoff fixed point theorem; cf. [3]. □

The following theorem establishes the existence of a solution to our prob- 
lem, which has the same average magnetization at the time t = 0 and t = T. 

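Theorem 1. Let (12) and (19) be valid. Then there are a process
$q = (\nu,\lambda) \in \mathcal{Y}(\Omega\times[0,T];S)\times\mathrm{BV}([0,T];L^1(\Omega;\mathbb{R}^L))$, $\Theta(q(0)) = \Theta(q(T))$, and
a net $\{q_{\tau_\xi}\}_{\xi\in\Xi}$, $\Theta(q_{\tau_\xi}(0)) = \Theta(q_{\tau_\xi}(T))$, such that:

(i) $\lim_{\xi\in\Xi}\lambda_{\tau_\xi}(t) = \lambda(t)$ strongly in $L^1(\Omega;\mathbb{R}^L)$ for all $t \in [0,T]$,

(ii) $\lim_{\xi\in\Xi}\nu_{\tau_\xi}(t) = \nu(t)$ weakly* in $L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S))$ for all $t \in [0,T]$,

(iii) $\lim_{\xi\in\Xi}\mathcal{G}_{\tau_\xi}(t,q_{\tau_\xi}(t)) = \mathcal{G}(t,q(t))$ for all $t \in [0,T]$.

Moreover, $\lambda = \mathcal{L}\bullet\nu$ a.e. on $\Omega$ for every $t \in [0,T]$ and $q$ thus obtained is a solution
process according to Definition 3 and $\Theta(q(0)) = \Theta(q(T))$.

Sketch of proof. The proof is similar to the one of [7, Th. 3.4] or
[6, Prop. 3.13]. The point (i) relies on the weak* compactness of the set
$L^\infty_{\mathrm{w}}(\Omega;\mathrm{rca}(S))^{[0,T]}$ due to Tychonoff's theorem on compactness of product
topologies. □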

Theorem 2. Let (12) and (19) hold. Then there is a solution process $q = (\nu,\lambda) \in \mathcal{Y}(\Omega\times[0,iT];S)\times\mathrm{BV}([0,iT];L^1(\Omega;\mathbb{R}^L))$
such that $\Theta(q(t+jT)) = \Theta(q(t))$ for any $t \in [0,T]$ and any $1 \le j \le i-1$. In particular,
$\int_\Omega m(x,t)\,\mathrm{d}x = \int_\Omega m(x,t+jT)\,\mathrm{d}x$.

Proof. The process $q$ can be constructed by the $T$-periodic extension of the
process whose existence was established in Theorem 1. Clearly, the extended
Definitions 1 and 2 hold for such a process. □
Summary. A new mixed finite element method for the diffusion equations on general
polygonal and polyhedral meshes is presented. The basis vector functions in
macro cells are designed by solving the local mixed finite element problems with the 
lowest order Raviart-Thomas elements. Numerical results for the Poisson equation 
on distorted prismatic meshes are given. 



1 Introduction 

This paper is a natural generalization of the authors’ recent results [3] to the 
general diffusion equation with mixed type boundary conditions. The paper is 
organized as follows. In Section 2 we give the formulation of the problem and 
describe the meshes to be used. Section 3 is the most important in the paper. 
We describe an approach to designing macroelements in the space of fluxes by 
solving mixed finite element problems with the lowest order Raviart-Thomas 
elements [1], [4] on macrocells. A convergence result is given in Section 4. 
Finally, in Section 5 we present and discuss numerical results for a particular 
3D test problem. 



2 Problem formulation 

We consider the diffusion problem in the form of the system of the first order
differential equations

$$K^{-1}u + \operatorname{grad} p = 0 , \qquad \operatorname{div} u + c\,p = f \tag{1}$$

in a bounded connected polygonal (polyhedral) domain $\Omega$ in $\mathbb{R}^d$, $d = 2$ ($d = 3$),
with homogeneous boundary conditions

$$p = 0 \ \text{ on } \Gamma_D, \qquad u\cdot n = 0 \ \text{ on } \Gamma_N. \tag{2}$$

Here $\Gamma_D$ and $\Gamma_N$ are the Dirichlet and the Neumann parts of the boundary $\partial\Omega$,
$n$ is the outward unit normal to $\partial\Omega$, $K = K(x)$ is the diffusion tensor,
$K = K^T > 0$, $c = c(x)$ is a nonnegative function, and $f \in L_2(\Omega)$. We assume
that $\Gamma_D$ is a closed subset of $\partial\Omega$ consisting of a finite number of segments
(polygons) in the case $d = 2$ ($d = 3$).

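The weak formulation of (1), (2) reads as follows: find
$u \in V = \{v : v \in H_{\mathrm{div}}(\Omega),\ v\cdot n = 0 \text{ on } \Gamma_N\}$, $p \in Q = L_2(\Omega)$
such that

$$\int_\Omega (K^{-1}u)\cdot v\,\mathrm{d}x - \int_\Omega p\,(\nabla\cdot v)\,\mathrm{d}x = 0, \qquad \int_\Omega (\nabla\cdot u)\,q\,\mathrm{d}x + \int_\Omega c\,p\,q\,\mathrm{d}x = \int_\Omega f\,q\,\mathrm{d}x \tag{3}$$

for all $(v,q) \in V\times Q$.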

Let $\Omega_h$ be a partitioning of $\Omega$ into $m$ nonoverlapping polygonal (polyhedral)
macrocells $E_k$:

$$\bar\Omega = \bigcup_{k=1}^{m} \bar E_k ,$$

and $V_h$ and $Q_h$ be finite element subspaces of $V$ and $Q$, respectively. In this
paper we assume that the interface $\Gamma_{kl} = \partial E_k \cap \partial E_l$ between macrocells $E_k$
and $E_l$ is either a point, or a segment, or a simply connected polygon in the
case $d = 3$ (see, for instance, Fig. 1). We also assume that for any function
$p_h \in Q_h$ its trace on $E_k$ is a constant, and for any vector-function $v \in V_h$ its
trace on $E_k$ is a piecewise affine vector-function. Moreover, we assume that for
any vector-function $v_h \in V_h$ its normal component $v_h\cdot n_{kl}$ on $\Gamma_{kl}$ is a constant,
where $n_{kl}$ is the unit normal vector to $\Gamma_{kl}$ directed from $E_k$ to $E_l$, $k > l$. An
example of a polygonal mesh $\Omega_h$ is given in Fig. 1. The arrows on the interfaces
show the normal component of the fluxes.

In the case of a triangular ($d = 2$) or a tetrahedral ($d = 3$) partitioning
of $\Omega$, $V_h$ is the lowest order Raviart-Thomas space $RT_0(\Omega_h)$ subject to the
boundary condition on $\Gamma_N$.

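The mixed finite element approximation to (1), (2) reads as follows: find
$(u_h,p_h) \in V_h\times Q_h$ such that

$$\int_\Omega (K^{-1}u_h)\cdot v\,\mathrm{d}x - \int_\Omega p_h\,(\nabla\cdot v)\,\mathrm{d}x = 0, \qquad \int_\Omega (\nabla\cdot u_h)\,q\,\mathrm{d}x + \int_\Omega c\,p_h\,q\,\mathrm{d}x = \int_\Omega f\,q\,\mathrm{d}x \tag{5}$$

for all $(v,q) \in V_h\times Q_h$. The finite element problem results in the system of
linear algebraic equations

$$A\begin{pmatrix}\bar u\\ \bar p\end{pmatrix} = \begin{pmatrix}0\\ \bar F\end{pmatrix} \tag{6}$$

with a saddle point matrix

$$A = \begin{pmatrix} M & B^T \\ B & -\Sigma \end{pmatrix} \tag{7}$$

where $M$ is a symmetric positive definite matrix and $\Sigma$ is a symmetric positive
definite (or semidefinite) matrix.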



3 New finite element space Vh 

Let $E$ be a macrocell in $\Omega_h$ and $\partial E$ be a union of $s$ segments, $d = 2$ (polygons,
$d = 3$), $\Gamma_1,\dots,\Gamma_s$. For the moment, we assume that no two adjacent segments
(polygons) $\Gamma_i$ and $\Gamma_j$ belong to the same line (plane). We recall
that adjacent interfaces of a macrocell $E_k$ with neighboring macrocells $E_i$ and
$E_j$ may belong to the same line, $d = 2$ (plane, $d = 3$). A possible situation is
shown in Fig. 2.

Let $E_h = \bigcup_{i=1}^{t} e_i$ be a conformal partitioning of $E$ into triangles $e_i$, $d = 2$
(tetrahedra, $d = 3$), and $RT_0(E_h)$ be the lowest order Raviart-Thomas finite
element space. Here $t$ is the total number of cells. We consider the mixed finite
element approximation to system (1) with right hand side $f = \mathrm{const}_E$ in $E$,
$c = 0$ in $E$, and with boundary conditions $u\cdot n_E = \delta_{1j}$ on $\Gamma_j$, $j = 1,\dots,s$, where
$n_E$ is the outward unit normal to $\partial E$: find $u_{E,h} \in RT_0(E_h)$, $u_{E,h}\cdot n_E = \delta_{1j}$
on $\Gamma_j$, $j = 1,\dots,s$,
$p_{E,h} \in Q_{E,h} = \{q : q = \mathrm{const} \text{ on } e_i,\ i = 1,\dots,t\}$
such that
Fig. 2. A macrocell $E$ and adjacent macrocells $E_k$, $E_i$, and $E_j$

$$|\Gamma| = \int_\Gamma \mathrm{d}l \quad\text{and}\quad |E| = \int_E \mathrm{d}x.$$

Under condition (9) problem (8) has the unique solution $w_1 = u_{E,h}$.
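The role of the constant right hand side can be illustrated with the divergence theorem: if the normal flux equals 1 on one boundary segment and 0 on the others, and the divergence is constant in $E$, then that constant must equal the length of the segment divided by the area of $E$. The sketch below computes this value for a polygon in the case $d = 2$. It is an assumption that conditions (9) and (12) have this form, and the function names are hypothetical.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon given by its (x, y) vertices, via the shoelace formula."""
    n = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % n][1]
                - vertices[(i + 1) % n][0] * vertices[i][1] for i in range(n))
    return abs(twice) / 2.0

def edge_length(vertices, i):
    """Length of the boundary segment from vertex i to vertex i+1 (cyclically)."""
    (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % len(vertices)]
    return math.hypot(x1 - x0, y1 - y0)

def compatible_divergence(vertices, i):
    """Constant divergence forced by unit normal flux on segment i and zero flux elsewhere."""
    return edge_length(vertices, i) / polygon_area(vertices)

# a pentagonal macrocell: a 2-by-1 rectangle with a triangular roof
pentagon = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 2.0), (0.0, 1.0)]
c0 = compatible_divergence(pentagon, 0)   # bottom segment of length 2, area 3
```

Any other value of the constant would make the local Neumann-type problem incompatible, which is why uniqueness of the local solution is stated under a condition on this constant.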

Repeating the same procedure on the same mesh $E_h$ with the boundary
conditions $u_{E,h}\cdot n_E = \delta_{ij}$ on $\Gamma_j$, $1 \le j \le s$, $i = 2,\dots,s$, we get the vector-functions
$w_1, w_2, \dots, w_s$ which satisfy the conditions

$$w_i\cdot n_E = \delta_{ij} \ \text{ on } \Gamma_j, \quad 1 \le i,j \le s.$$

Consequently, a vector function

$$w = \sum_{i=1}^{s} \alpha_i w_i \tag{10}$$

satisfies the boundary conditions

$$w\cdot n_E = \alpha_i \ \text{ on } \Gamma_i, \quad i = 1,\dots,s. \tag{11}$$

The extension of the procedure to the case when adjacent $\Gamma_i$ and $\Gamma_j$ may
belong to the same line, $d = 2$, or the same plane, $d = 3$, is straightforward.

To complete the description of the procedure for designing the vector-functions
$w_i$, $i = 1,\dots,s$, we consider a situation where $\Gamma_1$ is partitioned into
two segments, $d = 2$ (simply connected polygons, $d = 3$), $\Gamma_{1,D}$ and $\Gamma_{1,N}$. We
assume that any cell from $E_h$ adjacent to the boundary $\partial E$ satisfies the
condition that any of its edges, $d = 2$ (faces, $d = 3$), belongs either to the
interior of $E$, or to $\Gamma_D$, or to $\Gamma_N$. Then, the requested vector function $w_1$ is
the solution of the problem:

find $w_1 \in RT_0(E_h)$, $w_1\cdot n_E = 1$ on $\tilde\Gamma_1 = \Gamma_1\setminus\Gamma_N$, $w_1\cdot n_E = 0$ on $\partial E\setminus\tilde\Gamma_1$,
$p_{E,h} \in Q_{E,h}$ such that the equations (8) are satisfied for any $v \in RT_0(E_h)$,
$v\cdot n_E = 0$ on $\partial E$, $q \in Q_{E,h}$. The constant on the right hand side of (8) is
defined by the formula

$$\mathrm{const}_E = \frac{|\tilde\Gamma_1|}{|E|} \tag{12}$$

where

$$|\tilde\Gamma_1| = \int_{\tilde\Gamma_1} \mathrm{d}l.$$

The extension of the procedure to other $\Gamma_j$, $2 \le j \le s$, is again straightforward.

The vector-function $w$ in (10) with the new vector-functions $w_i$ satisfies the
conditions

$$w\cdot n_E = \alpha_i \ \text{ on } \tilde\Gamma_i, \quad i = 1,\dots,s, \qquad w\cdot n_E = 0 \ \text{ on } \partial E\cap\Gamma_N. \tag{13}$$

We denote the designed space of finite element vector-functions $w(x)$ for
a macroelement $E_k$ in $\Omega_h$ by $V_{k,h}$. The global space $V_h \subset H_{\mathrm{div}}(\Omega)$ for the
mesh $\Omega_h$ is defined by

$$V_h = \{v : v \in RT_0(\Omega_h),\ v|_{E_k} \in V_{k,h},\ k = 1,\dots,m\}. \tag{14}$$
4 Convergence

To prove the convergence result for the mixed finite element problem (5) with
the designed space $V_h$ we assume that

- all triangles (tetrahedra) $e_{k,i}$, $i = 1,\dots,t_k$, in the partitionings of $E_k$, $k = 1,\dots,m$,
are regular shaped [1],

- the number of triangles (tetrahedra) $t_k$ in the partitionings of $E_k$ is
bounded by a constant independent of $k$, i.e.,

$$\max_{1\le k\le m} t_k \le \mathrm{const}. \tag{15}$$

Then, the convergence result follows from [3] after some minor modifications.

Theorem 1. Under the assumptions made, the finite element solution $(u_h,p_h)$
of problem (5) converges to the solution $(u,p)$ of problem (3) in $H_{\mathrm{div}}(\Omega)\times L_2(\Omega)$, i.e.

$$\|u_h - u\|_{H_{\mathrm{div}}(\Omega)} \to 0, \qquad \|p_h - p\|_{L_2(\Omega)} \to 0 \quad\text{as } h \to 0 \tag{16}$$

where

$$h = \max_{1\le k\le m} \operatorname{diam} E_k .$$

5 Numerical results 

To illustrate the proposed method by numerical results we consider the Dirichlet
boundary value problem for the Laplace equation:

$$\Delta p = 0 \ \text{ in } \Omega, \qquad p = g \ \text{ on } \partial\Omega \tag{17}$$

where $\Omega$ is a prism in $\mathbb{R}^3$ with the rectangular faces orthogonal to the $(x,y)$-plane
and the triangular faces parallel to the $(x,y)$-plane, and the function $g$ is
the trace on $\partial\Omega$ of the test harmonic function $p = (x^2 - z^2) + x(y^2 - z^2)$.
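That the test function is harmonic can be checked numerically. The sketch below assumes the test function has the form $p = (x^2 - z^2) + x(y^2 - z^2)$ and evaluates a standard second-order central-difference Laplacian at an arbitrary point; the function names and the sample point are illustrative choices.

```python
def p(x, y, z):
    """Assumed test function p = (x^2 - z^2) + x*(y^2 - z^2)."""
    return (x * x - z * z) + x * (y * y - z * z)

def discrete_laplacian(f, x, y, z, h=1e-2):
    """Second-order central-difference approximation of the Laplacian of f at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)

# for a cubic polynomial the central difference has no truncation error,
# so the residual consists of rounding errors only
residual = discrete_laplacian(p, 0.3, -0.7, 1.1)
```

The residual vanishes up to rounding, consistent with $\partial_{xx}p + \partial_{yy}p + \partial_{zz}p = 2 + 2x - 2 - 2x = 0$.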

Let Qh be a distorted prismatic mesh which is obtained from the reference 
rectangular prismatic mesh by perturbation of vertices along the vertical mesh 
lines. An example of such a distorted macrocell is given in Fig. 3. The choice of 
this mesh is relevant to geophysical applications (basin modeling and reservoir 
simulation) where distorted prismatic meshes are used for discretizations of 
diffusion type equations in layered formations. 
Fig. 3. Reference (dashed lines) and distorted (solid lines) prismatic macrocell

Plot title: Normal component of fluxes, $p(x,y,z) = (x^2 - z^2) + x(y^2 - z^2)$, $\sigma = 0.3$




Fig. 5. Error for normal component of fluxes in C-norm 






622 Yu. Kuznetsov, S. Repin 



Four discretization methods have been compared; the numerical results are
shown in Fig. 4 and Fig. 5. The first method is the nonconforming mixed finite
element method with the lowest order Raviart-Thomas elements. In fact, the
finite elements for the space of fluxes are nonconforming only on triangular
interfaces between distorted prisms where the continuity conditions for the
normal component of fluxes are imposed in the center of mass of the triangles.
On the quadrilateral interfaces, which are orthogonal to the $(x,y)$-coordinate
plane, the normal components of the fluxes are constant.

The second method, which is called "Cubature" in the figures, is a modification
of the mimetic discretization [5] adapted to the prismatic meshes.
The third method is called "Interpolation". This is a conforming finite element
method based on a modification of the Piola transformation adapted to
prismatic meshes.

Finally, the method proposed in this paper is called "Const Div". The
numerical results are given for the error functions in the discrete maximum
norms. For the solution function $p$ (called "pressure") we measure the errors
for the mean values over the macrocells, and for the fluxes we measure the
mean values of the normal components over the interfaces.

The numerical results show that all the methods provide the same order 
of accuracy of approximation to the solution of the differential problem (17). 
We recall that the method proposed in this paper can be used on arbitrary 
polygonal and polyhedral meshes. 
Summary. We present a new semi-discrete central scheme for approximating solutions
of Hamilton-Jacobi equations on unstructured meshes. This scheme extends the
numerical Hamiltonians of Kurganov et al. to unstructured grids. Similarly to the
previous works on structured grids, a semi-discrete formulation of central schemes is
made possible due to estimates of the local speeds of propagation. The consistency
of the method is obtained following Abgrall's calculations for the consistency of an
upwind Lax-Friedrichs scheme on unstructured grids. We conclude with comments
on high-order reconstructions.



1 Introduction 

We present a new central-upwind scheme for approximating solutions of
Hamilton-Jacobi (HJ) equations on unstructured grids. These equations are
of the form

$$\phi_t + H(\nabla\phi) = 0, \qquad x = (x_1,\dots,x_d) \tag{1}$$

where $\phi = \phi(x,t)$, and with a Hamiltonian, $H$, that depends on $\nabla\phi$ and possibly
on $x$ and $t$. Since solutions of (1) may develop discontinuous derivatives
even when the initial data is smooth, it is generally required to interpret the
solution of (1) in a suitable weak sense. The corresponding theory, in the form
of "viscosity solutions", has been significantly developed over the past two
decades and we refer to [5, 6, 14, 15].

The increased understanding of the nature of solutions to HJ equations
has turned the area of numerical methods for the HJ equations into an active
research area. Converging first-order methods were introduced by Souganidis
[19]. The order of accuracy of the methods was increased through an essentially
non-oscillatory (ENO) reconstruction in the upwind schemes of Osher,
Sethian and Shu [17, 18]. Weighted essentially non-oscillatory (WENO) reconstructions,
which were first introduced for hyperbolic conservation laws [8, 16],
were then used by Jiang and Peng [7] to even further increase the accuracy of
the numerical approximations using a compact reconstruction. Extensions of
the first-order and ENO upwind methods to unstructured grids were done by
Abgrall [1]. The numerical fluxes of Abgrall were combined with WENO reconstructions
on triangular meshes by Hu and Shu in [20]. Another finite-volume
scheme on unstructured grids was proposed by Kossioris et al. in [9].

While upwind schemes require solving Riemann problems (or at least approximating
their solutions) on the interfaces between two cells, Godunov-type
central schemes utilize evolution points that are located away from the discontinuous
interfaces in order to avoid Riemann solvers altogether. Fully-discrete
central schemes for HJ equations were first introduced by Lin and Tadmor
in [12, 13], and further improved by Bryson and Levy in [2]. Fully-discrete
schemes of order greater than two were derived in [3] using central-WENO
reconstructions. Semi-discrete formulations of central schemes enjoy reduced
numerical dissipation while keeping track of the local speeds of propagation
of information that is propagating from the discontinuous interfaces. Second-order
semi-discrete central schemes for HJ equations were derived by Kurganov
and Tadmor in [11]. A more accurate estimate of the local speed of propagation
was then utilized to reduce the numerical dissipation in [10]. The numerical flux
of Kurganov, Noelle and Petrova was combined with the WENO reconstructions
of Jiang and Peng to obtain fifth-order, semi-discrete central schemes for
multi-dimensional HJ equations in [4].

In this paper we extend our previous works on semi-discrete central schemes
for multi-dimensional HJ equations to unstructured grids. Our derivation results
in a new numerical flux that takes into account information regarding
the local speeds of propagation of information from the discontinuous interfaces.
Similarly to any other central scheme, the evolution points are taken
away from the interfaces in order to avoid Riemann solvers. This numerical
Hamiltonian should be viewed as a central version of the numerical Hamiltonian
of Abgrall [1]. It can also be viewed as a generalization of the numerical
Hamiltonians of Kurganov et al. [10, 11] that were derived for Cartesian grids.
When we assume that the "unstructured" grid is Cartesian (which still is an
admissible grid in our formulation) the scheme we obtain is different from the
schemes obtained in [11, 10]. This difference results from a different averaging
procedure we use with the values obtained at the different evolution points,
and is reflected by a different coefficient in front of the dissipative term.



2 A Semi-discrete scheme for HJ equations 

We consider two-dimensional Hamilton-Jacobi equations of the form

$$\phi_t + H(\nabla\phi) = 0, \qquad x = (x_1,x_2) \in \Omega \subset \mathbb{R}^2 \tag{1}$$

augmented with initial values $\phi(x,t=0) = \phi_0(x)$. We assume that $\mathcal{T}$ is a
given triangulation of $\Omega$. The grid points are denoted by $x_\alpha$. For every grid
point $x_\alpha$ there are angular sectors $T_i^\alpha$ that are ordered counterclockwise. For
simplicity we will drop the $\alpha$ index from the triangles, and use the notation
$T_i = T_i^\alpha$ whenever possible. Each node of our triangulation may be visualized
as in Figure 1.

Fig. 1. Grid point $x_\alpha$ and its angular sectors. Each angular sector $T_i$ has an associated
angle $\theta_i$

The semi-discrete scheme will be constructed as a limit of a fully-discrete
scheme as the time-step tends to zero. We therefore assume a time step, $\Delta t$,
and use the standard notation $t^n = n\Delta t$. Let $\varphi_\alpha^n$ denote the approximate value
of $\phi(x_\alpha,t^n)$. We assume that for each time $t^n$, $\varphi_\alpha^n$ is given at each node
of the unstructured grid. We also assume that the values $\varphi_\alpha^n$ can be used to
reconstruct a continuous piecewise-polynomial interpolant, i.e., a polynomial
in each sector. We will comment below on a method for obtaining such a
reconstruction. In either case, the numerical flux we develop is independent of
the reconstruction step. We denote this reconstruction by $\tilde\varphi$ and denote the
approximation of $\nabla\varphi$ in each sector by $\nabla\varphi_{\alpha,i}$.

The second ingredient we need is an estimate of the maximal speeds of
propagation at the interfaces of each angular sector (in a direction that is
perpendicular to the interface). For any given angular sector, $T_i$, the
counterclockwise speed of propagation is denoted by $a_i^+$ and the speed of
propagation on the other interface is $a_i^-$. These speeds can be estimated by:

    $a_i^- = \max\{ \max_{T_i} \{\nabla H(\nabla\varphi_{\alpha,i}) \cdot n_{i-1,i}\},\ \max_{T_{i-1}} \{\nabla H(\nabla\varphi_{\alpha,i-1}) \cdot n_{i-1,i}\} \},$

    $a_i^+ = \max\{ \max_{T_i} \{\nabla H(\nabla\varphi_{\alpha,i}) \cdot n_{i+1,i}\},\ \max_{T_{i+1}} \{\nabla H(\nabla\varphi_{\alpha,i+1}) \cdot n_{i+1,i}\} \},$    (2)

where $n_{j,i}$ is the normal vector on the interface between sectors $T_j$ and $T_i$
pointing into $T_i$.

We can now determine in every sector $T_i$ around $x_\alpha$ an evolution point $x_\alpha^i$
that is located away from the interfaces (see Figure 2). This will be done using
the distances that are defined through the local speeds of propagation $a_i^\pm$.

The distance of the evolution point $x_\alpha^i$ from $x_\alpha$ is denoted by $d_i$. Clearly,
$d_i$ depends on the local speeds of propagation and on the angle $\theta_i$ and is
given by

    $d_i^2 = \frac{(a_i^+ \Delta t)^2 + 2 a_i^+ a_i^- \Delta t^2 \cos\theta_i + (a_i^- \Delta t)^2}{\sin^2\theta_i}.$    (3)

Fig. 2. Evolution point $x_\alpha^i$, derived from the maximal local speeds of propagation
into $T_i$, $a_i^+$ and $a_i^-$

We then define $\hat d_i$ to be $d_i/\Delta t$, a quantity that does not explicitly depend on
$\Delta t$.
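The distance formula (3) can be checked numerically. The following is a minimal sketch, assuming the evolution point lies at perpendicular distances $a_i^+\Delta t$ and $a_i^-\Delta t$ from the two interfaces of a sector with opening angle $\theta_i$; the function name `evolution_distance` is hypothetical and not part of the original text.

```python
import math


def evolution_distance(a_plus, a_minus, theta, dt):
    """Distance d_i between the node and the evolution point, as in (3).

    a_plus, a_minus: local speeds at the two interfaces of the sector
    theta: opening angle of the sector (0 < theta < pi)
    dt: time step
    """
    p = a_plus * dt   # perpendicular distance from the "+" interface
    q = a_minus * dt  # perpendicular distance from the "-" interface
    return math.sqrt(p * p + 2.0 * p * q * math.cos(theta) + q * q) / math.sin(theta)


# With equal speeds the point lies on the bisector, so d = a*dt / sin(theta/2).
d = evolution_distance(2.0, 2.0, math.pi / 2, 0.1)
```

Dividing the result by `dt` gives the rescaled distance, which no longer depends on the time step.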

The interpolant $\tilde\varphi(x, t^n)$ is evolved to the next time step at the points
$x_\alpha^i$, which are located away from the propagating discontinuities (assuming
that the time step $\Delta t$ is sufficiently small). From (1), this is given to first
order in time by the Taylor expansion

    $\varphi(x_\alpha^i, t^{n+1}) = \tilde\varphi(x_\alpha^i, t^n) - \Delta t\, H(\nabla\tilde\varphi(x_\alpha^i, t^n)) + O(\Delta t^2),$    (4)

where the approximation of the gradient $\nabla\varphi$ at $x_\alpha^i$ is obtained
from the reconstruction.

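The next step is to combine the values of $\varphi$ at the different evolution
points around $x_\alpha$ into one value $\varphi_\alpha^{n+1}$. This is done by writing a convex
combination with weights $s_i > 0$ that are yet to be determined:

    $\varphi_\alpha^{n+1} = \frac{1}{\sum_{i=1}^{m_\alpha} s_i} \sum_{i=1}^{m_\alpha} s_i\, \varphi(x_\alpha^i, t^{n+1}).$    (5)

Using (4), we may express (5) as

    $\varphi_\alpha^{n+1} = \frac{1}{\sum_{i=1}^{m_\alpha} s_i} \sum_{i=1}^{m_\alpha} s_i \left[ \tilde\varphi(x_\alpha^i, t^n) - \Delta t\, H(\nabla\tilde\varphi(x_\alpha^i, t^n)) \right] + O(\Delta t^2).$    (6)

If we now define $p_i$ to be the unit vector in the direction of $x_\alpha^i$ from $x_\alpha$, we
can use a Taylor expansion in space

    $\tilde\varphi(x_\alpha^i, t^n) = \tilde\varphi(x_\alpha, t^n) + d_i\, p_i \cdot \nabla\tilde\varphi(x_\alpha^i, t^n) + O(d_i^2).$

Here by $\nabla\tilde\varphi(x_\alpha^i, t^n)$ we refer to the value of the gradient at $x_\alpha$ that is
associated with the reconstruction in sector $T_i$ at $t^n$. We may therefore rewrite
(6) as the fully discrete scheme

    $\varphi_\alpha^{n+1} = \varphi_\alpha^n + \frac{\Delta t}{\sum_{i=1}^{m_\alpha} s_i} \sum_{i=1}^{m_\alpha} s_i \left[ \hat d_i\, p_i \cdot \nabla\tilde\varphi(x_\alpha^i, t^n) - H(\nabla\tilde\varphi(x_\alpha^i, t^n)) \right],$    (7)

where $\hat d_i = d_i/\Delta t$.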

In the limit $\Delta t \to 0$, (7) becomes a semi-discrete scheme:

    $\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{\sum_{i=1}^{m_\alpha} s_i} \sum_{i=1}^{m_\alpha} s_i \left[ \hat d_i\, p_i \cdot \nabla\varphi_i(t) - H(\nabla\varphi_i(t)) \right],$    (8)

where $\hat d_i = d_i/\Delta t$ and, for each $i$, $\nabla\varphi_i(t)$ denotes $\lim_{\Delta t \to 0} \nabla\tilde\varphi(x_\alpha^i, t^n)$.
All that remains is to determine the coefficients $s_i$ in (8). These coefficients
will be obtained through a consistency condition. The consistency of the scheme
implies that if the value of the gradient is identical in every sector that
surrounds $x_\alpha$, then the numerical Hamiltonian should reduce to the differential
Hamiltonian. Hence we are seeking coefficients $s_i$ such that

    $\sum_{i=1}^{m_\alpha} s_i\, \hat d_i\, p_i = 0.$    (9)



Such coefficients indeed exist, and we can use, e.g., the results of Abgrall in [1]
to find them. The observation that was made there was that if $p_{i+1/2}$ denotes
a unit vector in a direction that is aligned with the interface between the
sectors $T_i$ and $T_{i+1}$, and if $\theta_i < \pi$ (which is the case with a triangulation, e.g.),
then

    $\sum_{i=1}^{m_\alpha} \gamma_{i+1/2}\, p_{i+1/2} = 0,$    (10)

provided that

    $\gamma_{i+1/2} = \varepsilon \left( \tan\frac{\theta_i}{2} + \tan\frac{\theta_{i+1}}{2} \right)$    (11)

for any $\varepsilon > 0$. In order to incorporate (10)-(11) into our framework, we split
each angle $\theta_i$ into two parts that are defined as

    $\theta_i^\pm = \arcsin\frac{a_i^\pm}{\hat d_i}, \qquad \hat d_i = d_i/\Delta t,$    (12)

(see Figure 3).

Fig. 3. The angles around $x_\alpha$



The consistency condition (9) is then satisfied if the weights $s_i$ are defined
as

    $s_i = \frac{\varepsilon}{\hat d_i} \left[ \tan\left(\frac{\theta_{i-1}^+ + \theta_i^-}{2}\right) + \tan\left(\frac{\theta_i^+ + \theta_{i+1}^-}{2}\right) \right],$    (13)

where $\theta_i^\pm$ are given by (12). In our case, the coefficient $\varepsilon$ cancels
out in the semi-discrete formulation (8), so it can be omitted from (13).
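The consistency condition (9) can be verified numerically for a given node. The sketch below assumes that each sector angle is split as $\theta_i = \theta_i^- + \theta_i^+$ with $\theta_i^\pm = \arcsin(a_i^\pm \Delta t / d_i)$, that both parts are smaller than $\pi/2$, and that $\varepsilon = 1$; the function name and the sample data are hypothetical.

```python
import math


def consistency_weights(thetas, a_plus, a_minus):
    """Return weights s_i, scaled distances d_i/dt and unit directions p_i.

    thetas: sector angles, ordered counterclockwise, summing to 2*pi
    a_plus[i], a_minus[i]: local speeds at the interfaces with T_{i+1}, T_{i-1}
    Assumes every sub-angle is below pi/2 so that asin recovers it.
    """
    m = len(thetas)
    d_hat, th_p, th_m = [], [], []
    for th, ap, am in zip(thetas, a_plus, a_minus):
        d = math.sqrt(ap * ap + 2.0 * ap * am * math.cos(th) + am * am) / math.sin(th)
        d_hat.append(d)
        th_p.append(math.asin(min(1.0, ap / d)))  # angle to the "+" interface
        th_m.append(math.asin(min(1.0, am / d)))  # angle to the "-" interface

    weights = []
    for i in range(m):
        prev, nxt = (i - 1) % m, (i + 1) % m
        beta = (math.tan(0.5 * (th_p[prev] + th_m[i]))
                + math.tan(0.5 * (th_p[i] + th_m[nxt])))
        weights.append(beta / d_hat[i])

    directions, start = [], 0.0
    for i in range(m):
        ang = start + th_m[i]  # measured from the "-" interface of sector i
        directions.append((math.cos(ang), math.sin(ang)))
        start += thetas[i]
    return weights, d_hat, directions


thetas = [math.pi / 2, math.pi / 3, 2 * math.pi / 3, math.pi / 2]
s, d_hat, p = consistency_weights(thetas, [1.0, 2.0, 1.5, 0.5], [0.8, 1.0, 1.2, 2.0])
residual_x = sum(si * di * pi[0] for si, di, pi in zip(s, d_hat, p))
residual_y = sum(si * di * pi[1] for si, di, pi in zip(s, d_hat, p))
```

Both residual components should vanish up to rounding, which is the discrete statement of (9).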

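To summarize, if we define

    $\beta_i = \tan\left(\frac{\theta_{i-1}^+ + \theta_i^-}{2}\right) + \tan\left(\frac{\theta_i^+ + \theta_{i+1}^-}{2}\right),$

then the semi-discrete scheme is given by

    $\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{\sum_{i=1}^{m_\alpha} \beta_i/\hat d_i} \sum_{i=1}^{m_\alpha} \beta_i \left[ p_i \cdot \nabla\varphi_i(t) - \frac{1}{\hat d_i} H(\nabla\varphi_i(t)) \right],$    (14)

where $\hat d_i = d_i/\Delta t$.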



Remarks. 



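1. A simple version of the scheme can be obtained by assuming that all the
speeds of propagation are identical. In this case, the local speeds are
replaced by their maximum, i.e., $a = \max_i \{a_i^+, a_i^-\}$. In this case

    $\hat d_i = \frac{a}{\sin(\theta_i/2)},$

and the semi-discrete scheme (14) can be written in the simpler form

    $\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{\sum_{i=1}^{m_\alpha} \beta_i \sin(\theta_i/2)} \sum_{i=1}^{m_\alpha} \beta_i \left[ a\, p_i \cdot \nabla\varphi_i(t) - \sin\frac{\theta_i}{2}\, H(\nabla\varphi_i(t)) \right].$    (15)

If, in addition to the velocities being identical, the angles are also assumed
to be identical, i.e. $\theta_i = \theta$ $\forall i$, then (15) can be further simplified into

    $\frac{d}{dt}\varphi_\alpha(t) = \frac{1}{m_\alpha} \sum_{i=1}^{m_\alpha} \left[ \frac{a}{\sin(\theta/2)}\, p_i \cdot \nabla\varphi_i(t) - H(\nabla\varphi_i(t)) \right].$    (16)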



2. In [1] Abgrall derived a Lax-Friedrichs-type scheme on triangular meshes. 
In our notation, his scheme is of the form 



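    $\frac{d}{dt}\varphi_\alpha(t) = \frac{\varepsilon}{\pi} \sum_{i=1}^{m_\alpha} \beta_{i+1/2}\, p_{i+1/2} \cdot \frac{\nabla\varphi_i + \nabla\varphi_{i+1}}{2} - H\left( \frac{\sum_{i=1}^{m_\alpha} \theta_i \nabla\varphi_i}{2\pi} \right).$    (17)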

Here $p_{i+1/2}$ is the unit vector in the direction of the interface between the
sectors $T_i$ and $T_{i+1}$, and $\beta_{i+1/2} = \tan(\theta_i/2) + \tan(\theta_{i+1}/2)$. The derivation of
(17) involved evolution points that were located on the interfaces between
the sectors. This resulted in the form of the dissipative term in (17), which
contains averages of gradients in adjacent sectors. Also, the scheme (17)
involves a Hamiltonian that is evaluated at the average of the derivatives
that are computed in different sectors (with weights that are proportional
to the angles). This term was postulated to be in this form, and could have
taken different forms. In our case (14), this term is replaced by an average
over the Hamiltonian that is evaluated in different sectors. In our case, the
form of this term is dictated by the derivation of the scheme.

3. We would like to emphasize that the scheme (14) does not require Riemann
solvers. This is due to the scheme’s derivation, which places the evolution
points away from the boundaries of the angular sectors around every point
$x_\alpha$.

4. If the grid is a Cartesian grid with equal spacing in the x- and y-directions,
the number of angular sectors at each point is $m_\alpha = 4$, and $\sin(\theta/2) =
\sin(\pi/4) = \sqrt{2}/2$. In the simple case where all velocities are taken to be
identical in both directions, the scheme (16) becomes

    $\frac{d}{dt}\varphi_\alpha(t) = \frac{a}{2} \left( \varphi_x^+ - \varphi_x^- + \varphi_y^+ - \varphi_y^- \right)$    (18)
    $\qquad - \frac{1}{4} \left[ H(\varphi_x^+, \varphi_y^+) + H(\varphi_x^-, \varphi_y^+) + H(\varphi_x^+, \varphi_y^-) + H(\varphi_x^-, \varphi_y^-) \right],$

with the obvious notations, e.g. $H(\varphi_x^+, \varphi_y^+)$ is the Hamiltonian evaluated
at the gradient at $x_\alpha$ that is taken from the first quadrant. The scheme
(18) is identical to the consistent and monotone semi-discrete scheme that
was derived for Cartesian grids in [4, 11].

5. The order of accuracy of the scheme (14) is determined by the order of ac- 
curacy of the reconstruction and the order of the ODE solver. Other than 
that, the formulation (14) is independent of the order of accuracy of the 
scheme. The scheme is a Godunov-type scheme with a global reconstruction
that is evolved in time at evolution points that are located away from
the interfaces. In practice, the final semi-discrete scheme (14) requires only
the values of the gradient at the grid points $x_\alpha$ that are computed in the
different angular sectors around $x_\alpha$. Hence, all that one needs from the
reconstruction is the values of these gradients. It is therefore possible to
use the same reconstructions that were developed for upwind schemes for
HJ equations on triangular meshes. For example, a high-order (third- or
fourth-order) weighted essentially non-oscillatory (WENO) reconstruction
for HJ equations on triangular grids was derived by Zhang and Shu [20].
It can be incorporated as it is into the present framework.
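The reduction described in Remark 4 can be illustrated with a short sketch. It assumes four equal sectors (the quadrants), a single speed $a$, evolution directions along the sector bisectors, and one-sided derivatives $\varphi_x^\pm$, $\varphi_y^\pm$ supplied by the caller; the function names and the sample Hamiltonian are hypothetical.

```python
import math


def uniform_sector_rhs(H, grads, a):
    """Right-hand side of (16): equal speeds a and equal sector angles.

    grads[i] is the gradient (gx, gy) reconstructed in sector i; sector i
    spans the angles [i*theta, (i+1)*theta], and p_i is its bisector.
    """
    m = len(grads)
    theta = 2.0 * math.pi / m
    scale = a / math.sin(0.5 * theta)
    total = 0.0
    for i, (gx, gy) in enumerate(grads):
        ang = (i + 0.5) * theta
        total += scale * (math.cos(ang) * gx + math.sin(ang) * gy) - H(gx, gy)
    return total / m


def cartesian_rhs(H, px_p, px_m, py_p, py_m, a):
    """Right-hand side of (18) on a uniform Cartesian grid."""
    dissipation = 0.5 * a * (px_p - px_m + py_p - py_m)
    averaged_h = 0.25 * (H(px_p, py_p) + H(px_m, py_p) + H(px_p, py_m) + H(px_m, py_m))
    return dissipation - averaged_h


def sample_h(u, v):
    # Illustrative convex Hamiltonian, chosen only for this example.
    return 0.5 * (u * u + v * v)


px_p, px_m, py_p, py_m = 1.0, 0.5, -0.2, 0.3
quadrant_grads = [(px_p, py_p), (px_m, py_p), (px_m, py_m), (px_p, py_m)]
general = uniform_sector_rhs(sample_h, quadrant_grads, 2.0)
special = cartesian_rhs(sample_h, px_p, px_m, py_p, py_m, 2.0)
```

The two values agree up to rounding, and when all one-sided derivatives coincide the dissipative part vanishes, leaving minus the Hamiltonian.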



3 Conclusion 

We have derived a new semi-discrete central scheme for HJ equations on un- 
structured grids. This scheme is a generalization of the semi-discrete central 
schemes on Cartesian grids [4, 10, 11]. It is a Godunov-type scheme where 
a global reconstruction is evolved in time and then projected back to the grid 
points. Since the evolution is performed away from the interfaces of the angu- 
lar sectors, there is no need to use exact or approximate Riemann solvers. The 
formal accuracy of the scheme depends on the accuracy of the reconstruction
and the order of the ODE solver being used. 
Summary. The aim of this contribution is to present the current state of our re-
search in the field of numerical simulation of dislocations moving in crystalline materi- 
als. The simulation is based on recent theory treating dislocation curves and dipolar 
loops interacting by means of forces of elastic nature and hindered by the lattice 
friction. The motion and interaction of one dislocation curve and one dipolar loop 
placed in 3D space is considered. Equations of motion for a parametrically described 
dislocation curve are discretized by the flowing finite volume method in space. The
interaction force is computed for each dipolar loop and along the discretized curve. 
The resulting system of ordinary differential equations is solved by a higher order 
time solver. 



Physical background Plastic deformation of crystalline solids is a result of 
the motion of dislocations. The theory of dislocations is described in a number
of textbooks, e.g. [8]. Here we recall only the basic mobile properties of
dislocations and the nature of their mutual interactions. 

A dislocation is a line defect of the crystal lattice. Along the dislocation line 
the regular crystallographic arrangement of atoms is disturbed. The dislocation 
line is represented by a closed curve or a curve ending at the surface of the 
crystal. At low homologous temperatures the dislocations can move only along 
crystallographic planes (the slip planes) with the highest density of atoms. The 
motion results in mutual slipping of the neighboring parts of the crystal along 
the slip planes. The slip displacement carried by a single dislocation, called 
Burgers vector, is equal to one of the vectors connecting the neighboring atoms. 

The displacement field of atoms from their regular crystallographic posi- 
tions around a dislocation line can be treated (except the close vicinity of the 
line) as elastic stress and strain fields. On the other hand, a stress field exerts 
a force on a dislocation. The combination of these two effects causes the elastic 
interaction among dislocations. 

One of the most distinctive features of plastic deformation at the microscale
is a great overproduction of dislocations during a deformation process.
Only a small fraction of the generated dislocations is needed to carry plastic
deformation; the rest is stored in the crystal. Deformed crystals supersaturated
with dislocations tend to decrease their internal energy by mutual screening
of the elastic fields of the dislocations. If dislocations possess sufficient
maneuverability provided by easy cross-slip (solids with wavy slip), the leading
mechanism is individual screening. The dislocations are stored in the form of
dipoles, which are transformed into prismatic dislocation dipolar loops of
prevailing edge character, or such loops are formed directly (the experimental
evidence is summarized in [9]).

The glide dislocations and the dislocation loops have very different
characteristic length scales and mobile properties:

— While the segments of glide dislocations extend over distances of microme- 
ters, the size of the prismatic dipolar loops is of the order of 10 nm. 

— The glide dislocations are moved by the shear stress resolved in the slip 
plane, while the loops drift under stress gradients and/or are swept by the
glide dislocations. Being prismatic, the loops can move only along the
direction parallel to their Burgers vector.

— During deformation, the glide dislocations become curved. The local
curvature of the glide dislocations seems to be one of the leading factors
controlling the patterning [11, 12]. The loops can be approximately treated as
rigid objects.

Due to the above mentioned complexity the formation of dislocation struc- 
tures as a consequence of the interactions among dislocations is still an open 
problem. In this paper we will be concerned with a particular case: a disloca- 
tion curve interacting with a dipolar loop. 

Dislocation Curve and Dipolar Loop We consider a plane dislocation 
segment with fixed ends; the segment represented by a plane curve can bow in 
a slip plane which is identified with the xz-plane of the coordinate system, i.e. 
y = 0. When the dislocation segment approaches a loop, the curve can pass by,
the curve and the loop can start to move together, or the curve can be stopped
by the loop [13].

As the loop is allowed to move along the direction parallel to its Burgers 
vector b only (see Fig. 1a), just the force component in that direction causes
the loop motion. Additionally, the lattice friction acts against the motion. The 
detailed condition for the dislocation curve and the loop is specified in Sect. . 

The position of the loop is represented by 3 coordinates of its center. There
are two types of dipolar loops: a vacancy dipolar loop and an interstitial dipolar
loop, each type in two stable configurations [10] (see Fig. 2).

In vacancy loops a strip of atoms in regular crystallographic positions is 
missing. On the other hand, in interstitial loops an extra strip of atoms is
added. For that reason the vacancy and interstitial loops produce different
stress fields. The Burgers vectors of vacancy and interstitial loops have opposite
directions. We incorporate this fact into our model by using a negative value for
the x-axis component of the Burgers vector for the interstitial loop.
Fig. 1. Dislocation dynamics problem geometry: (a) Dislocation curve and dipolar
loop interaction; (b) Dipolar loop geometry

We denote the dipolar loop types and stable configurations by $V_1$, $V_2$ for
the vacancy loop, and $I_1$, $I_2$ for the interstitial loop, respectively (see Fig. 2).
In the mathematical model we represent the dipolar loop as a small rectangle
with two longer sides parallel to the z-axis and two shorter sides parallel either
to [1, 1, 0] or [1, -1, 0] depending on the type of loop. The dimensions of the
dipolar loop are $2l$ and $2\sqrt{2}h$, respectively, see Fig. 1b.

Fig. 2. Types and stable configurations of a dipolar loop. Longer sides of the dipolar
loop are parallel to the z-axis and lie in different layers of the atomic lattice.

Stress Field of Dipolar Loop Each type of dipolar loop produces a stress 
field. The formula for this field will be needed in the numerical simulation of 
the dislocation dynamics. In this work we use the stress field $\sigma_{ij}$ presented by
Kroupa et al. [6, 7] (using the Einstein summation convention):
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47t(1 — u) 



//? 



3(1 -2i/) 






hQkJ^nQn 4- (4z/ - l)bkl^k 



^ij 



( 1 ) 



-f(l - 2iy) {biUj H- bjUi) + — [bkQki^iQj + 4- i^kQk{biQj + bjQi)] 



3(1 -2z^), 15, 

4“ 2 bj^U^QiQj ^bj^Qj^jyyiQYiQiQj 



dA. 



In (1) we use the following symbols ($i, j, k, n \in \{1, 2, 3\}$):

$\sigma_{ij}$   $\sigma_{ij} = \sigma_{ij}(x, y, z)$, components of the stress field tensor,
      depending on the position in space
$\mu$   shear modulus
$\nu$   Poisson’s ratio
$A$   area of the dipolar loop, with $dA = 2h\sqrt{2}\, dl$
$b_i, b_j, b_k$   components of the Burgers vector
$\varrho_i, \varrho_j, \varrho_k, \varrho_n$   components of the relative position vector, $\varrho_1 = x$, $\varrho_2 = y$,
      $\varrho_3 = z$
$\varrho$   relative distance from the dipolar loop, $\varrho = \sqrt{x^2 + y^2 + z^2}$
$n_k$   components of the dipolar loop normal vector
$\delta_{ij}$   Kronecker symbol

The normal unit vector is chosen to be $n = \frac{1}{\sqrt{2}}[1, 1, 0]$ for dipolar loops of type
$V_1$ and $I_1$, and $n = \frac{1}{\sqrt{2}}[1, -1, 0]$ for dipolar loops of type $V_2$ and $I_2$.



Dipolar Loop and Dislocation Curve Interaction The interaction force 
per unit length of dislocation line is given by the Peach-Koehler equation, 
which written for the i-th component reads: 

    $f_i = \varepsilon_{ijk}\, \sigma_{jm}\, b_m\, s_k,$    (2)

where we denote:

$f_i$   i-th component of the interaction force per unit length of
      the dislocation line
$\varepsilon_{ijk}$   Levi-Civita symbol
$\sigma_{jm}$   components of the stress field tensor at the dislocation position
$b_m$   components of the Burgers vector
$s_k$   components of the unit vector s which has the direction of the
      dislocation line
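The Peach-Koehler formula (2) can be evaluated directly from its index form. The sketch below is illustrative only: it uses zero-based indices, plain nested lists for the stress tensor, and hypothetical function names and sample values.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def peach_koehler(sigma, b, s):
    """Force per unit length, f_i = eps_ijk * sigma_jm * b_m * s_k.

    sigma: 3x3 stress tensor, b: Burgers vector, s: unit tangent of the line
    """
    force = [0.0, 0.0, 0.0]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                eps = levi_civita(i, j, k)
                if eps == 0:
                    continue
                traction_j = sum(sigma[j][m] * b[m] for m in range(3))
                force[i] += eps * traction_j * s[k]
    return force


# Only sigma_xy is non-zero, Burgers vector along x, line in the xz-plane.
sigma = [[0.0, 3.0, 0.0],
         [3.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
f = peach_koehler(sigma, [0.5, 0.0, 0.0], [0.6, 0.0, 0.8])
```

For this configuration the x-component equals $\sigma_{xy} b\, s_z$ and the z-component equals $-\sigma_{xy} b\, s_x$, while the y-component vanishes.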

Mathematical Model The dynamics of the system of a dislocation curve 
and a dipolar loop is governed by a system of two equations describing their 
motion. The motion law for the dislocation curve is represented by the
well-known mean curvature flow equation (see e.g. [1, 3, 4])

    $B v = \kappa + F,$    (3)

where $v$ is the normal velocity of the evolving curve, $\kappa$ its curvature, $B$ (drag
coefficient) is a material constant and $F$ represents the external driving
force.

The moving dislocation curve $\Gamma_t$ can be parameterized by a smooth
time-dependent vector function $X : I \times S \to \mathbb{R}^2$, i.e., at any time $t$ it is given as
the Image$(X(\cdot, t)) = \{X(u, t),\ u \in S\}$, where $S$ is a fixed parametrization interval
and $I$ is a time interval. For a smooth curve $|X_u| > 0$, and for the unit arc-length
parametrization $s$, $ds = |X_u|\, du$. Then $X_s$ and $X_s^\perp$ represent the unit tangent and
normal vectors, respectively. Using Frenet’s formulae, the evolution equation
(3) can be rewritten in the form of an intrinsic diffusion equation [1, 3, 4, 5]

    $B X_t = X_{ss} + F X_s^\perp$    (4)

for the position vector $X$. If we denote by $X^x(t, s)$ and $X^z(t, s)$ the components
of the dislocation curve position vector in the xz-plane, then $X_s^\perp = [X_s^z, -X_s^x]$.
The equation (4) will be solved numerically to model complicated dislocation
curve dynamics.

Now we must explain exactly what is covered by the term F in (4). We 
know that F incorporates the interaction between the dislocation curve and 
the dipolar loop. To get detailed knowledge of F, we must go back to the 
Peach-Koehler equation (2). Assuming the dislocation curve can move only 
in the xz-plane and the dipolar loop can glide along the x-axis, we need to 
put fx and fz from the Peach-Koehler equation into the governing equations. 
Denoting the Burgers vectors of the dislocation curve and the dipolar loop
$b_{curve} = [b_{curve}, 0, 0]$ and $b = [b, 0, 0]$, respectively, we get

    $f_x = \sigma_{xy}\, b_{curve}\, s_z, \qquad f_z = -\sigma_{xy}\, b_{curve}\, s_x,$    (5)

where $\sigma_{xy} = \sigma_{12}$, and $s_x$, $s_z$ are the components of the dislocation curve’s
tangential vector, which can also be written as

    $[s_x, s_z] = [X_s^x, X_s^z].$    (6)

Thus, the term $F = \sigma_{ext} b_{curve} + \sigma_{xy} b_{curve}$ covers the stress of the dipolar
loop exerted on the dislocation curve, as well as any external stresses to which
the material may be exposed. To be more precise, in order to obtain the force
vector acting at a given position of the dislocation curve, one needs to multiply
F by the dislocation curve’s normal vector at that position. We also
explicitly write the dependence of F on the dislocation curve $\Gamma_t$ because the
curve position is required for the evaluation of its relative position to the
dipolar loop. The obtained relative position is then used in the evaluation
of $\sigma_{xy}$.

The stress $\sigma_{xy}$ is given by (1), but we can simplify this formula for our
specific situation: the Burgers vector has only one non-zero component and the
dipolar loop is a rectangle which has one of the two possible configurations.
Under the assumption that the dipolar loop parameter h is very small, we can
use a Taylor expansion in (1) and integrate it to obtain an algebraic formula for
the stress:

where



= \/ a :2 + 2/2 + (/ - 2)2 , = ->/ x 2 + 2/2 + (/ + 2^)2 



(8) 



With the upper sign in (7) we get the stress formula for the dipolar loops $V_1$
and $I_1$, while with the lower sign we get the formula for the $V_2$ and $I_2$ dipolar
loops.

In order to obtain the equation governing the motion of the dipolar loop we 
have to sum all the stress contributions of the dislocation curve. It is enough 
to consider only the contributions in the x-axis direction because it is the only 
direction the dipolar loop is allowed to glide in. 

The stress contribution of the dislocation curve can be obtained according
to the action-reaction principle by simply reversing the sign of $f_x$ in (5) and
integrating along the dislocation curve:

    $F_x = -\int_{\Gamma_t} \sigma_{xy}\, b_{curve}\, n_x\, ds,$    (9)

where $n_x$ is the x-axis component of the dislocation curve element normal
vector. Note that it can be replaced using the derivatives of X with respect to
the parametrization s, since $n_x = X_s^z$. Besides, there is one other kind
of force, the friction force $F_0$, which is a constant given by the material
and which acts against the gliding of the dipolar loop. Putting all the above
information together, we come to the equation governing the gliding of the
dipolar loop:

    $\frac{dx}{dt} = \frac{1}{BP}\, F_{x,total}(\Gamma_t, x(t)),$    (10)

where $x(t)$ is the x-axis position of the dipolar loop, $P = 4(l + \sqrt{2}h)$ is the
perimeter of the dipolar loop, and

    $F_{x,total}(\Gamma_t, x(t)) = F_x - F_0$   if $F_x > F_0$,
    $F_{x,total}(\Gamma_t, x(t)) = 0$           if $|F_x| \le F_0$,    (11)
    $F_{x,total}(\Gamma_t, x(t)) = F_x + F_0$   if $F_x < -F_0$.



The complete dislocation dynamics problem for one dislocation curve and one 
dipolar loop then follows when we put (4) and (10) together with initial and 
boundary conditions. 
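The friction rule for the dipolar loop can be written as a small helper. This is a minimal sketch of the thresholding described above, assuming a scalar driving force along the x-axis and a non-negative friction force; the function name is hypothetical.

```python
def total_glide_force(f_x, f_0):
    """Net force on the loop after lattice friction f_0 >= 0 is subtracted.

    The loop stays at rest while |f_x| <= f_0; otherwise the friction
    reduces the magnitude of the driving force by f_0.
    """
    if f_x > f_0:
        return f_x - f_0
    if f_x < -f_0:
        return f_x + f_0
    return 0.0


pushed_right = total_glide_force(5.0, 2.0)
pinned = total_glide_force(1.0, 2.0)
pushed_left = total_glide_force(-5.0, 2.0)
```

The result is continuous in the driving force, so the loop velocity does not jump when the force crosses the friction threshold.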
Numerical Scheme For discretization of the problem described earlier, we
use the flowing finite volume method [5] in space and the method of lines [2]
in time. The discrete solution is represented by a moving polygon given, at any
time t, by plane points $X_i$, $i = 0, \dots, M$. The values $X_0$ and $X_M$ are prescribed
in the case of fixed ends of the curve. The segments $[X_{i-1}, X_i]$ are called flowing
finite volumes. We also construct dual volumes $V_i = [X_{i-1/2}, X_i] \cup [X_i, X_{i+1/2}]$,
$i = 1, \dots, M-1$, where $X_{i-1/2} = (X_{i-1} + X_i)/2$ (see Fig. 3).

Fig. 3. Piecewise linear approximation of the dislocation curve

Integrating evolution equation (4) in dual volume $V_i$ we obtain

    $\int_{V_i} B X_t\, ds = \int_{V_i} X_{ss}\, ds + \int_{V_i} F X_s^\perp\, ds.$    (12)

Then we simply get

    $B\, \frac{d_i + d_{i+1}}{2}\, \frac{dX_i}{dt} = \left[ X_s \right]_{X_{i-1/2}}^{X_{i+1/2}} + F_i \left[ X^\perp \right]_{X_{i-1/2}}^{X_{i+1/2}},$    (13)

where

    $d_i = |X_i - X_{i-1}| = \sqrt{(X_i^x - X_{i-1}^x)^2 + (X_i^z - X_{i-1}^z)^2}$    (14)

and $F_i$ is a constant approximation of F in dual volume $V_i$, $F_i = \sigma_{xy}(R_i)\, b_{curve}$,
where $R_i = X_i - [x(t), y, z]$ is the relative position vector of $X_i$ and the
dipolar loop center. If we replace the terms on the right-hand side by finite
differences and averaged values, respectively, we end up with the system of
ordinary differential equations



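    $B\, \frac{dX_i}{dt} = \frac{2}{d_i + d_{i+1}} \left( \frac{X_{i+1} - X_i}{d_{i+1}} - \frac{X_i - X_{i-1}}{d_i} \right) + \frac{2}{d_i + d_{i+1}}\, F_i\, \frac{X_{i+1}^\perp - X_{i-1}^\perp}{2}, \qquad i = 1, \dots, M-1.$    (15)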
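The semi-discrete system for the curve can be sketched as a function that returns the nodal velocities. The sketch assumes points stored as (x, z) pairs, fixed end points, the perpendicular of a vector (vx, vz) taken as (vz, -vx), and a list of nodal forcing values supplied by the caller; all names are hypothetical.

```python
import math


def curve_velocity(points, forces, drag):
    """Velocities dX_i/dt of the polygon nodes for the flowing finite volume scheme.

    points: list of (x, z) nodes X_0..X_M; the end nodes are kept fixed
    forces: nodal values F_i of the forcing term
    drag: drag coefficient B
    """
    last = len(points) - 1
    velocity = [(0.0, 0.0)] * (last + 1)
    for i in range(1, last):
        xm, zm = points[i - 1]
        x, z = points[i]
        xp, zp = points[i + 1]
        d_i = math.hypot(x - xm, z - zm)       # length of segment [X_{i-1}, X_i]
        d_next = math.hypot(xp - x, zp - z)    # length of segment [X_i, X_{i+1}]
        w = 2.0 / (d_i + d_next)
        # Difference of the unit tangents on the two adjacent segments.
        curv_x = (xp - x) / d_next - (x - xm) / d_i
        curv_z = (zp - z) / d_next - (z - zm) / d_i
        # Perpendicular of the averaged chord (X_{i+1} - X_{i-1}) / 2.
        perp_x = 0.5 * (zp - zm)
        perp_z = -0.5 * (xp - xm)
        velocity[i] = (w * (curv_x + forces[i] * perp_x) / drag,
                       w * (curv_z + forces[i] * perp_z) / drag)
    return velocity


straight = curve_velocity([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [0.0, 0.0, 0.0], 1.0)
bump = curve_velocity([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)], [0.0, 0.0, 0.0], 1.0)
```

Without forcing, a straight polygon does not move, while the raised middle node of the second polygon is pulled back toward the chord. The resulting velocities can be passed to any ODE integrator in the spirit of the method of lines.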



In discretization of the governing equation (9) for the dipolar loop we sum 
contributions of every curve segment to obtain 

M-l 

= E ^curve (X4i - Xf ) , (16) 

i=0 

where $[x(t), y, z]$ is the center of the dipolar loop at time t. Next, we use formula
(11) applied to the discrete $F_x$ defined above and get

The complete discrete problem consists of (15) and (17) with accompanying
initial and boundary conditions. 

Results of Numerical Experiments We made several numerical simula- 
tions in which we used different settings. For the basic physical parameters we 
used values which were experimentally measured for nickel crystals at room 
temperature [10]: average length and width of the dipolar loop $l = 60$ nm,
$h = 4$ nm, Burgers vector $b = 0.26$ nm, shear modulus $\mu = 80$ GPa, and
Poisson’s ratio $\nu = 0.33$. When not specified otherwise, we used drag coefficient
$B = 10^{-5}$ Pa s.

In the simulations we varied not only the type and initial position
of the dipolar loop, but also the initial shape of the dislocation curve and the
value of the friction force. We observed the following facts (not all of them can
be demonstrated here):

— For the dislocation curve with fixed ends the curvature acts against the ex- 
ternal stress. Therefore, there exists some equilibrium shape the dislocation 
curve tends to. 

— When no external stress is applied, the dislocation curve of any initial shape 
tends to the straight line (potential energy minimization). Adding dipolar 
loop, an oscillating motion (Fig. 4) of the dipolar loop as well as the dislo- 
cation curve can occur. 

— The direction in which the dipolar loop leaves the dislocation curve’s
interaction region depends on the type of the dipolar loop. Simply, $V_2$ shifts to
the left whereas $V_1$ shifts to the right.
Fig. 4. Dipolar loop oscillations. A dipolar loop of type $V_1$ starts to glide to the left
of the dislocation curve (time levels T = 15.02, T = 25.022, T = 35.028). Then it
reverses as the attractive force of the dislocation curve gains control over the
system for some time (T = 45.048, T = 55.068). A second reversal occurs before
T = 63.084.
[Figure panels: T = 00.0000, T = 00.2000, T = 00.6005]

Fig. 5. Dipolar loop swept by the curve; in turn, the curve is distorted
by the stress field of the loop. Parameters used in this test: $\mu = 80$ GPa, $\nu = 0.33$,
$B = 10^{-5}$ Pa s, $b = 0.707$ nm, $l = 35$ nm, $h = \sqrt{2}$ nm, $F_0 = 4$ MPa m, applied stress
$\sigma_a = -1.155$ MPa. The initial position of the dipolar loop was [0, 40, -30]. The
subsequent stages are shown for increasing time T.
Summary. The flux-corrected transport (FCT) methodology is generalized to
implicit finite element schemes and applied to the Euler equations of gas dynamics.
The underlying low-order scheme is constructed by applying scalar artificial
viscosity proportional to the spectral radius of the cumulative Roe matrix. All conservative
matrix manipulations are performed edge-by-edge which leads to an efficient algo- 
rithm for the matrix assembly. The outer defect correction loop is equipped with 
a block-diagonal preconditioner so as to decouple the discretized Euler equations 
and solve all equations individually. As an alternative, a strongly coupled solution 
strategy is investigated in the context of stationary problems which call for large 
time steps. 



1 Introduction 

The concepts of flux-corrected transport can be traced back to the celebrated 
SHASTA scheme proposed by Boris and Book [1] in the early 1970s. Later, 
their algorithm was superseded by Zalesak’s multidimensional limiter [11] and 
carried over to finite elements by Löhner and his coworkers [7].

In recent publications [3], [4], [5] we presented a generalization of this
approach to implicit finite element discretizations. A notable benefit of the new
FEM-FCT formulation was the representation of antidiffusive terms as sums
of skew-symmetric internodal fluxes. Moreover, an iterative limiting strategy 
was introduced which prevents the limiter from getting overly diffusive for 
large time steps. In this paper, we concentrate on flux correction for the Euler 
equations of gas dynamics and discuss various algorithmic aspects pertinent 
to the treatment of nonlinear hyperbolic systems. 



2 FEM-FCT for scalar equations 

As a model problem, consider the generic conservation law $\partial u/\partial t + \nabla \cdot (\mathbf{v}u) = 0$,
where $\mathbf{v} = \mathbf{v}(x, t)$ is a nonuniform velocity field. Let us employ standard
Galerkin FEM for the discretization in space and interpolate the convective
fluxes in much the same way as the sought solution [2]. After mass lumping,
we get an ODE system given by

    $M_L \frac{du}{dt} = Ku \quad \Longleftrightarrow \quad m_i \frac{du_i}{dt} = \sum_{j \ne i} k_{ij}(u_j - u_i) + \delta_i u_i, \qquad \delta_i = \sum_j k_{ij},$    (1)

where $M_L = \mathrm{diag}\{m_i\}$ denotes the ‘lumped’ mass matrix and $K = \{k_{ij}\}$
stands for the discrete transport operator. The second expression in (1)
corresponds to a single node i where the first term in the right-hand side is
engendered by the incompressible part of the transport operator. Provided that
$k_{ij} \ge 0$, $\forall j \ne i$, a system of such form is called local extremum diminishing
(LED) in the absence of the term $\delta_i u_i$, which vanishes for divergence-free
velocity fields and is responsible for the physical growth of local extrema otherwise.

One major ingredient of any FCT algorithm is the nonoscillatory
positivity-preserving low-order scheme which can be constructed by applying artificial
diffusion $D = \{d_{ij}\}$ so as to render all off-diagonal entries of the linear operator
$L = K + D$ nonnegative [3]. Hence, the optimal diffusion coefficients are given
by

    $d_{ij} = d_{ji} = \max\{0, -k_{ij}, -k_{ji}\}, \qquad d_{ii} = -\sum_{j \ne i} d_{ij}.$    (2)

Since discrete diffusion operators have zero row/column sums [3], the diffusive
term can be decomposed into skew-symmetric fluxes $(Du)_i = \sum_{j \ne i} f_{ij}$, where
$f_{ij} = d_{ij}(u_j - u_i)$. Thus, the modifications in (2) are conservative. Starting
with the Galerkin operator $L := K$, the low-order operator can be constructed
by applying artificial diffusion edge-by-edge for each pair of nodes i and j
whose basis functions have overlapping support

    $l_{ii} := l_{ii} - d_{ij}, \qquad l_{ij} := l_{ij} + d_{ij},$
    $l_{ji} := l_{ji} + d_{ij}, \qquad l_{jj} := l_{jj} - d_{ij}.$    (3)
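The edge-by-edge construction of the low-order operator can be sketched as follows. For brevity the sketch stores the operator as a dense list of lists and loops over all node pairs, whereas a practical code would visit only the edges of the sparsity graph; the function name and the sample matrix are hypothetical.

```python
def discrete_upwinding(K):
    """Return L = K + D with nonnegative off-diagonal entries.

    For each pair (i, j) the coefficient d_ij = max(0, -k_ij, -k_ji) is added
    to the off-diagonal entries and subtracted from the diagonal ones, so
    the row sums of K are preserved.
    """
    n = len(K)
    L = [list(row) for row in K]
    for i in range(n):
        for j in range(i + 1, n):
            d = max(0.0, -K[i][j], -K[j][i])
            if d > 0.0:
                L[i][i] -= d
                L[i][j] += d
                L[j][i] += d
                L[j][j] -= d
    return L


# Skew-symmetric sample operator, as produced by a central discretization.
K = [[0.0, -1.0, 0.0],
     [1.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
L = discrete_upwinding(K)
```

For the sample matrix the result is the one-sided (upwind) operator, with every negative off-diagonal entry eliminated.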

The discrete upwinding technique presented above carries over to
multidimensions and yields the least diffusive linear LED scheme. However, linear
monotonicity-preserving methods can be at most first-order accurate. As a
consequence, compensating antidiffusion which constitutes the difference between
the discretizations of high and low order

    $F_i = \sum_{j \ne i} f_{ij}, \qquad f_{ij} = -\left[ m_{ij} \frac{d}{dt} + d_{ij} \right] (u_j - u_i), \qquad f_{ji} = -f_{ij}$    (4)

is to be constructed in a nonlinear way so as to remove the excessive artificial
diffusion.

After an implicit time discretization we obtain a nonlinear algebraic system

    $M_L \frac{u^{n+1} - u^n}{\Delta t} = \theta L u^{n+1} + (1 - \theta) L u^n + F(u^{n+1}, u^n), \qquad 0 < \theta \le 1,$    (5)

which can be solved via the fixed-point defect correction scheme

    $u^{(m+1)} = u^{(m)} + A^{-1} r^{(m)}, \qquad m = 0, 1, 2, \dots$    (6)

Here, $A$ is a ‘suitable’ preconditioner and $r^{(m)}$ is the defect
vector for the m-th iteration cycle. The latter incorporates the constant
right-hand side stemming from the low-order scheme plus compensating antidiffusion

    $b_i^{(m+1)} = b_i^n + \sum_{j \ne i} \alpha_{ij}^{(m)} f_{ij}^{(m)}, \qquad \text{where} \quad b^n = [M_L + (1 - \theta)\Delta t L] u^n.$    (7)

Varying the correction factors $\alpha_{ij}$ between zero and unity, one can blend the
high-order method with the concomitant low-order one. The latter should be
used in the vicinity of steep gradients where spurious oscillations are likely
to arise. The construction of the solution-dependent correction factors $\alpha_{ij}^{(m)}$ and
of the fully discretized antidiffusive fluxes $f_{ij}^{(m)}$ is elucidated in [5]. As an
alternative to (7), an iterative limiting strategy was proposed for implicit FEM-
FCT schemes operated at large time steps. Roughly speaking, the amount 
of previously accepted antidiffusion is taken into account so that only the 
rejected portion of the antidiffusive flux needs to be limited at subsequent 
defect correction steps. 



3 Euler equations 

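Compressible flows are governed by the Euler equations which represent a
system of conservation laws for the mass, momentum and energy of an inviscid
fluid. These hyperbolic PDEs are typically written in divergence form

    $\frac{\partial U}{\partial t} + \nabla \cdot \mathbf{F} = 0, \qquad \text{where} \quad \nabla \cdot \mathbf{F} = \sum_{d=1}^{3} \frac{\partial F^d}{\partial x_d}.$    (8)

The vector of conservative variables $U$ and the triple of fluxes $\mathbf{F} = (F^1, F^2, F^3)$
for each direction of the Cartesian coordinate system are defined as follows

    $U = \begin{bmatrix} \rho \\ \rho \mathbf{v} \\ \rho E \end{bmatrix}, \qquad \mathbf{F} = \begin{bmatrix} \rho \mathbf{v} \\ \rho \mathbf{v} \otimes \mathbf{v} + p\mathcal{I} \\ \rho H \mathbf{v} \end{bmatrix}.$    (9)

Here, $\rho$, $\mathbf{v}$, $p$, $E$ and $H = E + p/\rho$ stand for the density, velocity, pressure,
total energy per unit mass and stagnation enthalpy, respectively. This system
is completed by an equation of state $p = (\gamma - 1)\rho(E - |\mathbf{v}|^2/2)$, where $\gamma = c_p/c_v$
denotes the ratio of specific heats for a polytropic gas ($\gamma = 1.4$ for air).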

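By application of the chain rule, the Euler equations can be written in an equivalent quasi-linear formulation in terms of the Jacobian matrices $\mathbf{A} = (A^1, A^2, A^3)$, $A^d = \partial F^d/\partial U$:

$$\frac{\partial U}{\partial t} + \mathbf{A}\cdot\nabla U = 0, \quad\text{where}\quad \mathbf{A}\cdot\nabla U = \sum_{d=1}^{3} A^d \frac{\partial U}{\partial x_d}. \qquad (10)$$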




644 M. Moller et al. 



4 Galerkin matrix assembly 

In what follows, an efficient edge-based assembly technique for the standard Galerkin discretization of the Euler equations is presented. Let us start with the divergence form (8) and interpolate the fluxes using the group finite element formulation [2], which yields an ODE system similar to (1)

$$M_C \frac{dU}{dt} = KU. \qquad (11)$$

Here, $M_C$ denotes the block-diagonal consistent mass matrix for the coupled system and $K$ is a discrete counterpart of the operator $-\mathbf{A}\cdot\nabla$ for the quasi-linear formulation (10). Let the entries of the mass matrix and the vector of coefficients coming from the discretization of space derivatives be defined as follows

$$m_{ij} = \int_\Omega \varphi_i \varphi_j \, dx, \qquad c_{ij} = \int_\Omega \varphi_i \nabla\varphi_j \, dx. \qquad (12)$$

As long as the mesh is fixed, the coefficients $c_{ij}$ remain constant and thus the operator $K$ can be assembled efficiently without resorting to a costly numerical integration.

Recall that basis functions sum to unity, so that the sum of their derivatives vanishes. Hence, the coefficients satisfy $c_{ii} = -\sum_{j \ne i} c_{ij}$, and the right-hand side of the five coupled equations for node i is given by

$$(KU)_i = -\sum_j c_{ij}\cdot\mathbf{F}_j = -\sum_{j \ne i} c_{ij}\cdot(\mathbf{F}_j - \mathbf{F}_i). \qquad (13)$$

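In his pioneering work on approximate Riemann solvers [9], Roe showed that the differences between the components of F and U are related by $\mathbf{F}_j - \mathbf{F}_i = \mathbf{A}_{ij}(U_j - U_i)$, where the triple of matrices $\mathbf{A}_{ij} = (A^1_{ij}, A^2_{ij}, A^3_{ij})$ corresponds to the Jacobian tensor $\mathbf{A}$ evaluated for the special set of density-averaged variables

$$\rho_{ij} = \sqrt{\rho_i \rho_j}, \qquad \mathbf{v}_{ij} = \frac{\sqrt{\rho_i}\,\mathbf{v}_i + \sqrt{\rho_j}\,\mathbf{v}_j}{\sqrt{\rho_i} + \sqrt{\rho_j}}, \qquad H_{ij} = \frac{\sqrt{\rho_i}\,H_i + \sqrt{\rho_j}\,H_j}{\sqrt{\rho_i} + \sqrt{\rho_j}}. \qquad (14)$$

This enables us to express the nodal value $(KU)_i$ in terms of the conservative variables

$$(KU)_i = -\sum_{j \ne i} c_{ij}\cdot\mathbf{A}_{ij}\,(U_j - U_i), \quad\text{where}\quad c_{ij}\cdot\mathbf{A}_{ij} = \sum_{d=1}^{3} c^d_{ij} A^d_{ij}. \qquad (15)$$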
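The Roe property quoted above can be checked numerically. The following is a minimal sketch for the one-dimensional Euler equations only, assuming the standard 1D flux Jacobian written in terms of velocity and stagnation enthalpy; the function names are illustrative and do not come from the paper.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats for air, as stated in the text


def primitives(U):
    """Return (rho, v, H) for conservative variables U = (rho, rho*v, rho*E)."""
    rho, mom, rhoE = U
    v = mom / rho
    p = (GAMMA - 1.0) * (rhoE - 0.5 * rho * v**2)  # equation of state
    return rho, v, (rhoE + p) / rho


def flux_1d(U):
    """Flux vector of the 1D Euler equations."""
    rho, v, H = primitives(U)
    p = (GAMMA - 1.0) * (U[2] - 0.5 * rho * v**2)
    return np.array([rho * v, rho * v**2 + p, rho * H * v])


def roe_averages(Ui, Uj):
    """Density-averaged variables (rho_ij, v_ij, H_ij), cf. (14)."""
    rho_i, v_i, H_i = primitives(Ui)
    rho_j, v_j, H_j = primitives(Uj)
    si, sj = np.sqrt(rho_i), np.sqrt(rho_j)
    return si * sj, (si * v_i + sj * v_j) / (si + sj), (si * H_i + sj * H_j) / (si + sj)


def roe_matrix_1d(v, H):
    """1D flux Jacobian dF/dU expressed through velocity v and enthalpy H."""
    g1 = GAMMA - 1.0
    return np.array([
        [0.0, 1.0, 0.0],
        [0.5 * (GAMMA - 3.0) * v**2, (3.0 - GAMMA) * v, g1],
        [v * (0.5 * g1 * v**2 - H), H - g1 * v**2, GAMMA * v],
    ])
```

For any two admissible states, `roe_matrix_1d(*roe_averages(Ui, Uj)[1:]) @ (Uj - Ui)` should reproduce `flux_1d(Uj) - flux_1d(Ui)` up to round-off.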



The dot product can be interpreted as a ‘projection’ of the triple $\mathbf{A}_{ij}$ onto the numerical edge ij. For our purposes, it is expedient to introduce the splitting $c_{ij}\cdot\mathbf{A}_{ij} = -(A_{ij} + B_{ij})$, where the two components of the cumulative Roe matrix are defined by [5]

$$A_{ij} = a_{ij}\cdot\mathbf{A}_{ij}, \qquad a_{ij} = \frac{c_{ji} - c_{ij}}{2}, \qquad (16)$$

$$B_{ij} = b_{ij}\cdot\mathbf{A}_{ij}, \qquad b_{ij} = -\frac{c_{ij} + c_{ji}}{2}. \qquad (17)$$

A similar decomposition can be performed for the contribution of the edge ij to $(KU)_j$

$$(KU)_j \longleftarrow -c_{ji}\cdot\mathbf{A}_{ij}\,(U_i - U_j), \quad\text{where}\quad c_{ji}\cdot\mathbf{A}_{ij} = A_{ij} - B_{ij}. \qquad (18)$$

Integration by parts reveals that in the interior of the domain $c_{ji} = -c_{ij}$ while $b_{ij} = 0$ [5], so that only the skew-symmetric part $A_{ij}$ needs to be evaluated for interior edges. The symmetric part $B_{ij}$ only applies to the cumulative Roe matrices for boundary edges. According to (15)-(18), the contribution of the edge ij to the term $KU$ reads

$$(A_{ij} + B_{ij})(U_j - U_i) \longrightarrow (KU)_i, \qquad (19)$$

$$(A_{ij} - B_{ij})(U_j - U_i) \longrightarrow (KU)_j. \qquad (20)$$

Together with the fact that the coefficients Cij remain constant and thus can 
be assembled and stored once and for all during the initialization process, 
(19)-(20) suggest an efficient edge-based algorithm for the matrix assembly. 
The underlying data structure can be generated from the sparsity pattern of 
the finite element matrix and contains entries for all pairs of nodes whose 
basis functions have overlapping supports [5]. In contrast to the scalar case 
this connectivity exists not only between basis functions for different nodes 
but also between those for different variables. Hence, each coefficient of the 
discrete operator is given by a square matrix of dimension equal to the number 
of variables. 

It can be readily inferred from (19)-(20) that the contribution of the numerical edge ij to the global matrix $K$ is given by

$$K_{ii} = -A_{ij} - B_{ij}, \qquad K_{ij} = A_{ij} + B_{ij}, \qquad K_{ji} = -A_{ij} + B_{ij}, \qquad K_{jj} = A_{ij} - B_{ij}. \qquad (21)$$

These local Jacobians are evaluated edge-by-edge and their entries $K^{kl}_{ij}$, $k, l = 1, \ldots, 5$ are scattered to the corresponding positions in the 25 blocks $K_{kl}$ [5].
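The edge-by-edge scatter implied by (19)-(20) can be sketched as follows. This is only an illustration under stated assumptions: the matrix is stored densely, the unknowns of each node are kept contiguous (node-major ordering, whereas the paper scatters into 25 variable-wise blocks), and the edge matrices `A_blocks` and `B_blocks` are supplied by the caller. The function name is hypothetical.

```python
import numpy as np


def assemble_edge_based(n_nodes, edges, A_blocks, B_blocks, n_var=5):
    """Accumulate the edge contributions (A_ij + B_ij) and (A_ij - B_ij) into K.

    edges    : list of node pairs (i, j)
    A_blocks : skew-symmetric parts A_ij, one (n_var x n_var) array per edge
    B_blocks : symmetric boundary parts B_ij (zero for interior edges)
    """
    K = np.zeros((n_nodes * n_var, n_nodes * n_var))

    def block(i, j):
        return (slice(i * n_var, (i + 1) * n_var),
                slice(j * n_var, (j + 1) * n_var))

    for (i, j), A, B in zip(edges, A_blocks, B_blocks):
        # row i receives (A + B)(U_j - U_i), cf. (19)
        K[block(i, i)] -= A + B
        K[block(i, j)] += A + B
        # row j receives (A - B)(U_j - U_i), cf. (20)
        K[block(j, i)] += -A + B
        K[block(j, j)] += A - B
    return K
```

Because every edge contribution is a multiple of the difference $U_j - U_i$, the assembled operator annihilates any state that is constant over the mesh, which gives a cheap sanity check for an implementation.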



5 Artificial viscosity 

To a large extent, the ability of a FEM-FCT algorithm to withstand the formation of wiggles depends on the quality of the underlying low-order method. For scalar transport equations we derived the least diffusive positivity-preserving scheme by elimination of negative off-diagonal entries from the discrete transport operator.

In [5] the LED principle was generalized to hyperbolic systems by rendering all off-diagonal matrix blocks positive semi-definite. If we perform mass lumping and replace the high-order operator $K$ in (11) by the low-order one, we obtain the ODE system

$$M_L \frac{dU}{dt} = LU, \quad\text{where}\quad \begin{aligned} L_{ii} &= -A_{ij} - D_{ij}, & L_{ij} &= A_{ij} + D_{ij}, \\ L_{ji} &= -A_{ij} + D_{ij}, & L_{jj} &= A_{ij} - D_{ij}. \end{aligned} \qquad (22)$$

The low-order operator $L$ is constructed in much the same way as (21) by applying tensorial artificial viscosity $D_{ij} \in \mathbb{R}^{5\times 5}$ to the Roe matrices. The global matrix assembly can be adopted from the previous section. The missing symmetric boundary part $B_{ij}$ of the cumulative Roe matrix is incorporated into the raw antidiffusive fluxes

$$F_{ij} = -\left(M_{ij}\frac{d}{dt} + D_{ij} - B_{ij}\right)(U_j - U_i), \qquad F_{ji} = -F_{ij}, \qquad (23)$$

where $M_{ij} = m_{ij} I$ denotes the local diagonal mass matrix.

The hyperbolicity of the Euler equations implies that any linear combination of the three Jacobian matrices is diagonalizable with real eigenvalues, such that the cumulative Jacobian matrix admits the following factorization

$$A_{ij} = R_{ij}\,\Lambda_{ij}\,R_{ij}^{-1}. \qquad (24)$$

Let us ‘project’ the density-averaged velocity $\mathbf{v}_{ij}$ onto the numerical edge ij and define the local speed of sound as follows

$$v_{ij} = \frac{a_{ij}\cdot\mathbf{v}_{ij}}{|a_{ij}|}, \qquad c_{ij} = \sqrt{(\gamma - 1)\left(H_{ij} - \frac{|\mathbf{v}_{ij}|^2}{2}\right)}. \qquad (25)$$

Here, $|a_{ij}|$ denotes the Euclidean norm of the coefficient vector $a_{ij}$. As a consequence, the diagonal matrix of eigenvalues can be readily computed as

$$\Lambda_{ij} = |a_{ij}|\,\mathrm{diag}\{v_{ij} - c_{ij},\; v_{ij},\; v_{ij},\; v_{ij},\; v_{ij} + c_{ij}\}.$$

In [5] we gave a detailed description of how to derive a generalization of Roe’s 
approximate Riemann solver from (24) by elimination of negative eigenvalues. 

A much cheaper alternative is to add scalar dissipation proportional to the spectral radius of the Roe matrix, $d_{ij} = |a_{ij}|(|v_{ij}| + c_{ij})$ [5]. The resulting artificial viscosity operator $D_{ij} = d_{ij} I$, which in fact is the same for all components, needs to be applied only to the five diagonal blocks of the finite element matrix. Numerical examples demonstrate that, in the framework of flux correction, the final solution even benefits from this slightly overdiffusive low-order scheme because of an improvement in the phase accuracy [4], so that the application of a costly Riemann solver does not pay off.
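As an illustration of how cheap the scalar dissipation is to evaluate, here is a minimal sketch of the coefficient $d_{ij} = |a_{ij}|(|v_{ij}| + c_{ij})$ for one edge. It assumes that the edge velocity is the projection of the Roe-averaged velocity onto the coefficient vector and that the speed of sound follows the usual Roe formula; both are assumptions about details that the text only describes in words, and the function name is hypothetical.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats for air


def scalar_dissipation(a_ij, v_roe, H_roe):
    """Spectral radius of the cumulative Roe matrix for one edge.

    a_ij  : coefficient vector of the edge (3 components)
    v_roe : density-averaged velocity vector
    H_roe : density-averaged stagnation enthalpy
    """
    a_ij = np.asarray(a_ij, dtype=float)
    v_roe = np.asarray(v_roe, dtype=float)
    a_norm = np.linalg.norm(a_ij)
    v_edge = np.dot(a_ij, v_roe) / a_norm              # projected velocity
    c_edge = np.sqrt((GAMMA - 1.0) * (H_roe - 0.5 * np.dot(v_roe, v_roe)))
    return a_norm * (abs(v_edge) + c_edge)
```

Only one scalar per edge has to be stored, and the same value is added to each of the five diagonal blocks.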




Implicit FEM-FCT algorithm for compressible flows 647 



6 Defect correction 



After an implicit time discretization we obtain a nonlinear algebraic system similar to (5) which can also be solved by the defect correction scheme

$$U^{(m+1)} = U^{(m)} + A^{-1} R^{(m)}, \qquad R^{(m)} = B^{(m+1)} - A\,U^{(m)}.$$

In a practical implementation the ‘inversion’ of $A$ is performed by applying some inner iteration to solve the linear subproblem for the solution increment (an improvement of the residual by 1-2 digits suffices) and updating the last iterate thereafter

$$A\,\Delta U^{(m+1)} = R^{(m)}, \qquad U^{(m+1)} = U^{(m)} + \Delta U^{(m+1)}, \qquad U^{(0)} = U^n. \qquad (28)$$



The matrix in the left-hand side of this linear system can be replaced by 
a (block-diagonal) preconditioner , so as to decouple the discretized Euler 
equations [5]. As an alternative we can apply the (preconditioned) BiCGSTAB 
method directly to the coupled system (28). 

Let us split the low-order operator into its diagonal, subdiagonal and su- 
perdiagonal parts = FFC’^) + j(^) + In [5], a block- Jacobi precon- 
ditioner was suggested for the defect correction scheme, so that 

only the five diagonal blocks need to be assembled and stored 



d™) = 4™^ = Ml- Vfc 



cif = 0, VZ / k. 



As a consequence, the linear system (28) resolves into a sequence of scalar subproblems which can be solved separately or, at best, in parallel. For this purpose an iterative method, e.g. (preconditioned) BiCGSTAB or geometric multigrid, is applied to

$$A^{(m)}_{kk}\,\Delta u^{(m+1)}_k = r^{(m)}_k, \qquad k = 1, \ldots, 5. \qquad (30)$$

However, this segregated solution approach is unsuitable for larger time steps, since severe convergence problems of the outer iteration can be observed.
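The segregated approach can be illustrated on a small linear model problem. The sketch below is not the authors' solver: it applies defect correction with a block-diagonal (block-Jacobi) preconditioner to a fixed matrix, uses a direct solve in place of the inner BiCGSTAB or multigrid iteration, and all names are illustrative.

```python
import numpy as np


def defect_correction(A, b, block_size, n_iter=30):
    """Iterate u <- u + C^{-1} (b - A u), with C the block diagonal of A."""
    n = A.shape[0]
    C = np.zeros_like(A)
    for start in range(0, n, block_size):
        sl = slice(start, start + block_size)
        C[sl, sl] = A[sl, sl]          # keep only the diagonal blocks
    u = np.zeros(n)
    for _ in range(n_iter):
        defect = b - A @ u             # residual of the coupled system
        u = u + np.linalg.solve(C, defect)
    return u
```

The iteration converges quickly when the off-diagonal blocks are weak compared with the diagonal ones, and degrades as the coupling grows. This is consistent with the convergence problems reported for large time steps, where the off-diagonal blocks of the operator are scaled by the time step.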

Longing for the full potential of the iterative limiter and the unconditional 
positivity of fully implicit time-stepping we have been testing some coupled 
solution strategies for (28) by means of a preconditioned BiCGSTAB method. 
In this case, additional blocks of the low-order operator may need to be as- 
sembled and stored or there should be another (direct) way to carry out the 
matrix- vector multiplications for updating the residual. 

Let the preconditioner for the BiCGSTAB solver be given by 
where 

= ML-eAtL^^\'^l <k, and = 0,VI > k, (31) 

= ML-GAtL^^\yi> k, and = 0,\/l < k. (32) 
This corresponds to the block-Gauss-Seidel scheme

j{rr^) — j^(^) _ (33) 

Swapping the sub- and superdiagonal block-matrices in the equation above 
gives another variation of this algorithm. The alternating application of both 
subdiagonal and superdiagonal block-matrices results in a symmetric block- 
Gauss-Seidel approach. Recall that in equation (33) is a block-diagonal 
matrix for which each block corresponds to a scalar problem similar to (30). 
They can be solved in much the same way as for the segregated solution 
approach. The design of ‘optimal’ preconditioners is a nontrivial task which 
constitutes an important field for future research. 



7 Numerical examples 

Let us illustrate the potential of the implicit FEM-FCT algorithm by considering a steady two-dimensional supersonic flow over a wedge. Here, the
free-stream Mach number is M = 2.5 and the deflection angle is 15°. The
results presented at the top of Figure 1 are computed on a mesh of 128 x 128 
bilinear elements by the low-order scheme (left) and the implicit FEM-FCT 
algorithm (right), respectively. It can be readily seen that the shock wave is 
unacceptably smeared by the low-order method. Nevertheless, both the up- 
stream and downstream Mach numbers are predicted correctly. The iterative 
flux limiter resolves the shock very precisely within as few as 3-4 elements. 




To demonstrate the potential ability of our discrete FEM-FCT approach to deal with unstructured meshes, we constructed an adaptive coarse grid with the grid points clustered in the vicinity of the shock wave, see Figure 1 (bottom left). After two steps of global refinement this gives the computational mesh of approximately 10,000 vertices. The resulting numerical solution (bottom right) exhibits superb accuracy and remains absolutely free of oscillations.

Numerical results for a variety of standard gas dynamic test cases encom- 
passing both transient and stationary flows are presented in [5]. Moreover, an 
in-depth investigation of scalar problems can be found in the same publication. 



8 Conclusions 

To our knowledge, most of the finite element schemes for solving the Euler
equations on unstructured grids are explicit and, consequently, subject to
a restrictive CFL condition. In this paper, an implicit high-resolution finite
element scheme for hyperbolic systems was presented making use of the flux- 
corrected transport paradigm. The underlying low-order operator was con- 
structed by applying scalar artificial viscosity proportional to the spectral ra- 
dius of the cumulative Roe matrix for each edge of the sparsity graph. An 
efficient edge-based approach to matrix assembly was proposed. The design 
of suitable preconditioners for both segregated and coupled solution proce- 
dures was addressed. The performance of the new FEM-FCT algorithm was 
illustrated for a steady supersonic flow without and with adaptive mesh re- 
finement. The development of robust and efficient iterative solvers for implicit 
FEM-FCT schemes including FAS-FMG multigrid [6] , [8] and an analog of the 
local MPSC smoother for the incompressible Navier- Stokes equations [10] will 
be addressed in forthcoming publications. 

Summary. This paper proposes an approximation scheme to a classical one-phase Stefan problem. The scheme is constructed to represent a singular limit of a certain reaction-diffusion system approximating the Stefan problem. To this end, a time-discrete operator-splitting methodology is used. Numerical experiments demonstrate that the scheme would be useful in practical computations.



1 Introduction 

A classical one-phase Stefan problem is one of the mathematical models which describe the melting of a body of ice maintained at temperature 0°C. One of the attractive and important subjects of research on this problem is the analysis of the interface between ice and water. The interface is a hypersurface and the topological structure of the ice and water regions may change. This fact induces numerical difficulties, and many numerical schemes have been proposed to track the interfaces [1, 5, 8, 12, 13].

The aim of this paper is to propose an approximation scheme to the following one-phase Stefan problem, which was formulated by Eymard et al. [2, 4, 6]:

$$(SP)\quad \begin{cases} w_t = \Delta(w^+) & \text{in } Q_T, \\ w^+ = A & \text{on } \partial\Omega\times(0,T], \\ w(x,0) = w_0(x) & \text{for } x \in \Omega, \end{cases}$$

where $\Omega$ is a bounded domain in $\mathbb{R}^d$ ($d \in \mathbb{N}$) with smooth boundary $\partial\Omega$, $T$ is a positive number, $Q_T = \Omega\times(0,T]$, $A$ and $w_0$ are given functions and $w^{\pm} = \max\{\pm w, 0\}$. The temperature of water is indicated by $u := w^+$, and $v := w^-$ is the function such that the support of $v$ coincides with the mushy region, which has been studied by Bertsch et al. [2]. The ice-water interface is given by $\Gamma(t) := \partial\Omega^+(t)\cap\Omega$, where $\Omega^+(t) = \{x \in \Omega \mid u(x,t) > 0\}$.

The idea of our scheme is to apply a time-discrete operator-splitting methodology to a reaction-diffusion system given by Eymard et al. We also introduce a singular limit solution to construct an approximation scheme to Problem (SP). The advantages of our numerical method are:

1. It has low computational cost,

2. It may be used on arbitrary geometries of $\Omega$,

3. There are no artificial parameters,

4. We can track the interface even in high-dimensional problems,

5. Topological changes and complicated interfacial shapes can be handled easily.

In this paper, the convergence of the scheme is shown by numerical experiments in the one-dimensional case. Quite recently, we proposed a similar approximation scheme to moving boundary problems (including classical Stefan problems), and proved the convergence of the scheme [10]. We remark that our approximation scheme or numerical method is similar to the diffusion-generated approach for the mean curvature flow [3, 9] or the threshold competition dynamics method [7].

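At the end of this section, we explain the details of the initial function $w_0$. The classical one-phase Stefan problem is often described by

$$\begin{cases} u_t = \Delta u & \text{in } \bigcup_{0<t\le T} \Omega(t)\times\{t\}, \\ u = 0 & \text{on } \bigcup_{0<t\le T} \Gamma(t)\times\{t\}, \\ \lambda V_n = -\dfrac{\partial u}{\partial\nu} & \text{on } \bigcup_{0<t\le T} \Gamma(t)\times\{t\}, \\ u = A & \text{on } \partial\Omega\times(0,T], \\ u(x,0) = u_0(x) & \text{for } x \in \Omega_0, \\ \Omega(0) = \Omega_0, \end{cases} \qquad (1)$$

where $\Omega(t) \subset \Omega$ is an unknown domain which describes the water region, $\Gamma(t) = \partial\Omega(t)\cap\Omega$, $\lambda$ is the latent heat, $V_n$ is the normal speed of the interface $\Gamma(t)$, $\nu$ stands for the exterior unit normal vector to $\Gamma(t)$, $u_0$ is a positive function representing the initial heat distribution, and $\Omega_0$ is the initial water region. We note that Problem (SP) is equivalent to Problem (1) if the interface $\Gamma(t)$ is a smooth surface such that $\Gamma(t) \subset \Omega$ for all $t \in (0,T]$ and varies smoothly with t, the function $u$ is smooth up to $\Gamma(t)$ for $t \in (0,T]$, and

$$w_0(x) = \begin{cases} u_0(x) & \text{if } x \in \Omega_0, \\ -\lambda & \text{otherwise} \end{cases} \qquad (2)$$

(see [4]). Our numerical simulations are made by using (2).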

2 Our scheme 

First of all, we introduce our approximation scheme to the solution of Problem (SP).

Approximation Scheme 1. Let N be a positive integer and $0 = t_0 < t_1 < \cdots < t_m < \cdots < t_N = T$. An approximation $w^m$ to $w(t_m)$ is defined by

$$w^m = \mathcal{K}(t_m)\circ\mathcal{K}(t_{m-1})\circ\cdots\circ\mathcal{K}(t_1)\,w_0,$$

where $\mathcal{K}(t)$ is an operator with domain $L^2(\Omega)$ defined by

$$\mathcal{K}(t)z := \mathcal{H}_m(t)z^+ - z^- \quad\text{for } t_m < t \le t_{m+1} \quad (m = 0, 1, 2, \ldots, N-1),$$

and $\mathcal{H}_m(t)z$ denotes the solution of the following heat equation

$$\begin{cases} u_t = \Delta u & \text{in } \Omega\times(t_m, t_{m+1}], \\ u = A & \text{on } \partial\Omega\times(t_m, t_{m+1}], \\ u(x, t_m) = z(x) & \text{for } x \in \Omega. \end{cases} \qquad (3)$$

Most of the computational cost of Approximation Scheme 1 is spent on the heat equation (3), and the first advantage ‘1. It has low computational cost’ stated in Section 1 follows. If we use a finite element method or a finite volume method to solve (3), one can say ‘2. It may be used on arbitrary geometries of $\Omega$’. Furthermore, we have ‘3. There are no artificial parameters’, as shown above.

In the following we explain the basic idea of our scheme (see also [11]). Eymard et al. [4] introduced the following reaction-diffusion system:

$$(RD_k)\quad \begin{cases} U_t = \Delta U - kUV & \text{in } Q_T, \\ V_t = -kUV & \text{in } Q_T, \\ U = A & \text{on } \partial\Omega\times(0,T], \\ U(x,0) = u_0(x) := w_0^+(x) & \text{for } x \in \Omega, \\ V(x,0) = v_0(x) := w_0^-(x) & \text{for } x \in \Omega, \end{cases}$$

where k is a positive parameter. They show the relationship between Problems $(RD_k)$ and (SP) in the following

Proposition 1 (Eymard, Hilhorst, van der Hout and Peletier [4]). Let 

us assume 

(AeWi’\QT)nC{QT), 

< uq{x) = A(x,0) for X G i?, 

[ 0 < uq < A" in 

for some positive constant K. Then {RDk) has a unique weak solution 
^ W^^\Qt) X COd([o,T];L°°(V?)) for every k > 0, and 

jj(^) -A w~^ , w~ as /c oo, 

strongly in LP‘{Qt), where w is a weak solution of Problem (SP). 



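To construct an approximate solution of Problem $(RD_k)$ for $k = \infty$ (singular limit solution), we apply a standard operator-splitting methodology to Problem $(RD_k)$, that is, we split Problem $(RD_k)$ into diffusion and reaction parts as follows:

We construct families of functions $\{U^m\}_{m=0}^{N}$ and $\{V^m\}_{m=0}^{N}$ by

Step 1: Let $U^0(x) = u_0(x)$ and $V^0(x) = v_0(x)$.

Step 2: For $m = 0, 1, \ldots, N-1$,

Step 2-1 (diffusion part): Let $\tilde U^m(\cdot, t_{m+1}) = \mathcal{H}_m(t_{m+1})U^m$.

Step 2-2 (reaction part): Solve the following initial value problem:

$$\begin{cases} \hat U^m_t = -k\hat U^m\hat V^m & \text{in } \Omega\times(t_m, t_{m+1}], \\ \hat V^m_t = -k\hat U^m\hat V^m & \text{in } \Omega\times(t_m, t_{m+1}], \\ \hat U^m(x, t_m) = \tilde U^m(x, t_{m+1}) & \text{for } x \in \Omega, \\ \hat V^m(x, t_m) = V^m(x) & \text{for } x \in \Omega. \end{cases} \qquad (4)$$

Step 2-3: Put

$$U^{m+1}(x) = \hat U^m(x, t_{m+1}), \qquad V^{m+1}(x) = \hat V^m(x, t_{m+1}).$$

We are now in the position to construct an approximation scheme to Problem (SP). Let $\theta = k(t - t_m)$. Then (4) becomes

$$\begin{cases} \hat U^m_\theta = -\hat U^m\hat V^m & \text{in } \Omega\times(0, k\tau], \\ \hat V^m_\theta = -\hat U^m\hat V^m & \text{in } \Omega\times(0, k\tau], \end{cases}$$

with $\tau = t_{m+1} - t_m$. Passing to the limit as $k \to \infty$, that is $\theta \to \infty$, we obtain

$$\begin{cases} \lim_{\theta\to\infty} \hat U^m(x, \theta) = (\tilde U^m(x, t_{m+1}) - V^m(x))^+, \\ \lim_{\theta\to\infty} \hat V^m(x, \theta) = (\tilde U^m(x, t_{m+1}) - V^m(x))^-. \end{cases}$$

Here we use the facts that $(\hat U^m - \hat V^m)_\theta = 0$, $\tilde U^m(x, t_{m+1}) \ge 0$ and $V^m(x) \ge 0$.

From the above formal calculations, we obtain Approximation Scheme 1.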
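The two-stage structure of the scheme (a heat step followed by the pointwise singular-limit reaction) can be sketched in one space dimension as follows. This is a minimal illustration, not the authors' code: it assumes implicit Euler finite differences for the heat step, a Dirichlet value C at x = 0, a mirror (zero-flux) condition at the right end, a dense linear solve, and illustrative default parameters.

```python
import numpy as np


def stefan_1d(C=4.0601569, lam=1.0, length=2.0, M=100, dt=1e-3, t_end=0.25):
    """Return the grid x and w = u - v at time t_end."""
    dx = length / M
    x = np.linspace(0.0, length, M + 1)
    r = dt / dx**2

    # implicit Euler matrix for u_t = u_xx
    A = np.zeros((M + 1, M + 1))
    for i in range(1, M):
        A[i, i - 1] = -r
        A[i, i] = 1.0 + 2.0 * r
        A[i, i + 1] = -r
    A[0, 0] = 1.0                      # Dirichlet condition u(0, t) = C
    A[M, M - 1] = -2.0 * r             # mirror condition u_{M+1} = u_{M-1}
    A[M, M] = 1.0 + 2.0 * r

    u = np.zeros(M + 1)
    u[0] = C                           # water only at the heated boundary
    v = np.full(M + 1, lam)            # ice carries the latent heat
    v[0] = 0.0

    for _ in range(int(round(t_end / dt))):
        rhs = u.copy()
        rhs[0] = C
        u_star = np.linalg.solve(A, rhs)   # diffusion part
        w = u_star - v                     # reaction part, limit k -> infinity
        u = np.maximum(w, 0.0)
        v = np.maximum(-w, 0.0)
    return x, u - v


def interface_position(x, w, lam=1.0):
    """First grid point where w has dropped below the level -lam/2."""
    below = np.nonzero(w <= -0.5 * lam)[0]
    return x[below[0]] if below.size else x[-1]
```

With the default parameters, the computed interface at t = 0.25 can be compared with the value $2\alpha\sqrt{t} = 1$ of the similarity solution used in the one-dimensional experiment.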



3 Numerical results 

In this section, some numerical results obtained by our scheme are shown. We deal with a one-dimensional case with a known exact solution and a three-dimensional case. In the former, we show the accuracy of our numerical interfaces by comparing them with the exact ones. In the latter, it is demonstrated that the moving boundaries can be tracked even in high-dimensional problems.

3.1 One-dimensional results

Let s(t) be the position of the interface at time t. We deal with a simple one-dimensional problem with s(0) = 0 and u(0, t) = C for t > 0, where C is a positive constant (see [14]). In this case, it is already known that

$$s(t) = 2\alpha\sqrt{t}, \qquad (5)$$

where $\alpha$ is a constant satisfying

$$\alpha e^{\alpha^2}\int_0^{\alpha} e^{-z^2}\,dz = \frac{C}{2\lambda}. \qquad (6)$$

Let us compare our numerical interfaces with (5) when $\alpha = \lambda = 1$, $\Omega = (0, 2)$ and T = 1. We compute a numerical approximation to the integral in (6) using Simpson’s rule and employ 4.0601569 as an approximation to C. Fixed uniform grids in space and time are adopted, that is, the spatial mesh size is $\delta x = 2/M$ and the time mesh size is $\delta t = 1/N$, where M is a positive integer. Let $u_i^n$ and $v_i^n$ ($0 \le i \le M$ and $0 \le n \le N$) be the approximations to $u(i\delta x, n\delta t)$ and $v(i\delta x, n\delta t)$, respectively, which implies that $w_i^n := u_i^n - v_i^n$ is the approximation to the weak solution $w(i\delta x, n\delta t)$ of Problem (SP). The initial data are given by

$$u_0^0 = C, \quad v_0^0 = 0, \qquad u_i^0 = 0, \quad v_i^0 = 1 \quad (0 < i \le M).$$
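The boundary value C quoted above can be reproduced with a few lines of code. The sketch below assumes the relation $C = 2\lambda\alpha e^{\alpha^2}\int_0^\alpha e^{-z^2}dz$ and a composite Simpson rule with an illustrative number of subintervals; the function names are not from the paper.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def boundary_temperature(alpha=1.0, lam=1.0):
    """Boundary value C belonging to the interface s(t) = 2*alpha*sqrt(t)."""
    integral = simpson(lambda z: math.exp(-z * z), 0.0, alpha)
    return 2.0 * lam * alpha * math.exp(alpha**2) * integral
```

For $\alpha = \lambda = 1$ this evaluates to approximately 4.0601569, in agreement with the value employed in the experiment.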

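We employ the implicit finite difference technique to obtain the numerical solution of the heat equation. Our numerical method then reduces to iterating the following two steps.

Step 1: Define $\tilde u_i^n$ by

$$\frac{\tilde u_i^n - u_i^n}{\delta t} = \frac{\tilde u_{i+1}^n - 2\tilde u_i^n + \tilde u_{i-1}^n}{\delta x^2} \quad (0 < i \le M), \qquad \tilde u_0^n = C, \qquad \tilde u_{M+1}^n = \tilde u_{M-1}^n.$$

Step 2: Compute $u_i^{n+1}$ and $v_i^{n+1}$ ($0 \le i \le M$) by

$$u_i^{n+1} = (\tilde u_i^n - v_i^n)^+, \qquad v_i^{n+1} = (\tilde u_i^n - v_i^n)^-.$$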



Fig. 1. Numerical interfaces for some meshes

Fig. 2. Close-up of the numerical interfaces in Fig. 1

We show the exact and numerical interfaces in Fig. 1, where we employ the isosurface of level $-\lambda/2$ of $\{w_i^n\}_{0\le i\le M,\ 0\le n\le N}$ as the numerical interface. Fig. 2 is a close-up of Fig. 1. One can say that the numerical interface converges to the exact one.

3.2 Three-dimensional simulation 

Our numerical method can easily be applied to multi-dimensional problems. In this subsection we compute a three-dimensional case in the computational domain $\Omega$. Fig. 3 shows our numerical simulation, which demonstrates how heat fluxes melt ice. In this simulation, we employ the explicit finite difference technique to obtain the numerical solution of the heat equation because of our computational environment. The spatial mesh sizes are
Sx — 6y = 6z = 1/100 and the time step is St = 3 x 10“^. The boundary 
condition is $u|_{\partial\Omega} = 1$. The initial data are shown at the top left of Fig. 3, where $(u_0(x,y,z), v_0(x,y,z)) = (1, 0)$ holds in the liquid phase (black regions) and $(u_0(x,y,z), v_0(x,y,z)) = (0, 1)$ holds in the solid one (white regions).

We may observe that an interface with complex behavior can be computed, that is, one can say ‘4. We can track the interface even in high-dimensional problems’ and ‘5. Topological changes and complicated interfacial shapes can be handled easily’.

Summary. The generalized nonlinear Schrödinger (GNLS) equation is solved numerically by a split-step Fourier method. The first, second and fourth-order versions
of the method are presented. A classical problem concerning the motion of a sin- 
gle solitary wave is used to compare the first, second and fourth-order schemes in 
terms of the accuracy and the computational cost. This numerical experiment shows 
that the split-step Fourier method provides highly accurate solutions for the GNLS 
equation. Furthermore, two test problems concerning the interaction of two solitary 
waves and an exact solution which blows up in finite time are investigated by using 
the fourth-order split-step scheme and particular attention is paid to the conserved 
quantities as an indicator of the accuracy. 



1 Introduction 

The generalized nonlinear Schrödinger (GNLS) equation is a nonlinear partial differential equation given by

$$i w_t + w_{xx} + q_1 |w|^2 w + q_2 |w|^4 w + i q_3 (|w|^2)_x w + i q_4 |w|^2 w_x = 0, \qquad (1)$$

where $i = \sqrt{-1}$, w is a complex valued function of the spatial coordinate x and the time t, the parameters $q_1$, $q_2$, $q_3$ and $q_4$ are real constants and the subscripts t and x denote differentiation with respect to time and space, respectively. Compared to the usual nonlinear Schrödinger equation with a cubic nonlinearity, the GNLS equation possesses both cubic and quintic nonlinearities and nonlinear terms that contain derivatives. It has been derived as a model equation governing the modulation of a quasi-monochromatic wave train in a weakly nonlinear, dispersive medium.

Some properties of the GNLS equation are summarized here. Assume that w and all its derivatives converge to zero sufficiently rapidly as $x \to \pm\infty$. Solutions of the GNLS equation subjected to these boundary conditions are known to satisfy some conservation laws [1, 2]. According to these conservation laws, the conserved quantities

$$I_1 = \int_{-\infty}^{\infty} |w|^2 \, dx \qquad (2)$$
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/ C» -1 -j 

{Kl^ “2(293+94) lm{ww*J - -qi |wl^ 

+ ^[93(293 + 94) - 292] \ wf } da: ( 3 ) 

/ OO 

[2 Im(4t;u)*) - 93 Iwl"*] dx, (4) 

-OO 

where the symbol * denotes complex conjugation, remain constant in time.
Note that $I_1$ represents the theoretical $L_2$ norm of the system. Although the
GNLS equation is generally known as a nonintegrable equation in the sense of 
the inverse scattering method, certain cases of the GNLS equation are com- 
pletely integrable and such equations possess soliton solutions and an infinite 
number of conservation laws. For certain values of the coefficients and certain 
initial conditions, solutions of the general equation (1) experience finite-time 
blow up [1]. Only a few analytical solutions corresponding to some special 
cases of the GNLS equation are available [1, 2]. Therefore, numerical studies 
are essential to develop an understanding of the phenomena related to the 
GNLS equation. 

One of the numerical methods employed for nonlinear dispersive wave equa- 
tions is the split-step method proposed by Tappert [3]. The basic idea in the 
split-step method is to decompose the original problem into subproblems which 
are simpler than the original problem and then to compose the approximate 
solution of the original problem by using the exact or approximate solutions 
of the subproblems in a given sequential order. For nonlinear dispersive wave 
equations which are derived by balancing the effects of dispersion and non- 
linearity, such as the GNLS equation that we will be solving, an appropriate 
approach is to split the original problem into linear and nonlinear subproblems 
which take into account purely dispersive and purely nonlinear effects, respec- 
tively. While various numerical methods have been employed for the numerical 
solutions of the cubic NLS equation in which the split-step method profits from 
the existence of a simple analytical solution for the nonlinear subproblem, less 
attention has been paid to the numerical solution of the GNLS equation of 
which the cubic NLS equation is a special case. A first-order split-step method 
was suggested by Pathria and Morris for the GNLS equation in [2]. 

The main purpose of this study is to introduce higher-order split-step
Fourier schemes for the GNLS equation and to compare these schemes from
a computational efficiency viewpoint. To this end, the initial and boundary-value
problem is decomposed into linear and nonlinear subproblems. A Fourier
method is employed for the spatial discretizations of both linear and nonlinear 
subproblems. While the linear subproblem is treated exactly, a fourth-order 
Runge-Kutta scheme is used for the time integration of the nonlinear sub- 
problem. Three different numerical schemes which are basically the first-order, 
second-order and fourth-order versions of the present split-step Fourier method 
are proposed. For an application of the present split-step Fourier method to 
the complex modified Korteweg-de Vries equation, we refer the reader to [4]. 




660 G.M. Muslu, H.A. Erbay 

2 The Numerical Method 



2.1 Review of the split-step method

It is best to present the split-step method as applied to a general evolution equation in the form

$$w_t = (\mathcal{L} + \mathcal{N})\,w \qquad (5)$$

where $\mathcal{L}$ and $\mathcal{N}$ are linear and nonlinear operators, respectively, and $\mathcal{L}$ and $\mathcal{N}$ do not commute with each other. For instance, we have

$$\mathcal{L} = i\,\partial_x^2, \qquad \mathcal{N} = i q_1 |w|^2 + i q_2 |w|^4 - q_3 (|w|^2)_x - q_4 |w|^2\,\partial_x$$

for the GNLS equation. If, for the moment, $\mathcal{L}$ and $\mathcal{N}$ are assumed to be t independent, a formally exact solution of equation (5) is given by

$$w(x, t + \Delta t) = \exp[\Delta t(\mathcal{L} + \mathcal{N})]\,w(x, t) \qquad (6)$$

where $\Delta t$ is the time step between the initial and final times. The linear equation $w_t = \mathcal{L}w$ and the nonlinear equation $w_t = \mathcal{N}w$ have known exact solutions

$$w(x, t + \Delta t) = \exp(\Delta t\,\mathcal{L})\,w(x, t) \qquad (7)$$

and

$$w(x, t + \Delta t) = \exp(\Delta t\,\mathcal{N})\,w(x, t), \qquad (8)$$

respectively. The main idea in the split-step method is to approximate the exact solution of equation (5) by solving the purely linear and purely nonlinear equations in a given sequential order, in which the solution of one subproblem is employed as an initial condition for the next subproblem. This may be realized by replacing the exponential operator $\exp[\Delta t(\mathcal{L} + \mathcal{N})]$ in equation (6) by a solution operator $\varphi_n(\Delta t)$ which includes an appropriate combination of products of the exponential operators $\exp(\Delta t\,\mathcal{L})$ and $\exp(\Delta t\,\mathcal{N})$. This produces a splitting error due to the noncommutativity of $\mathcal{L}$ and $\mathcal{N}$, and at this stage the celebrated Baker-Campbell-Hausdorff (BCH) formula is very useful to reduce noticeably the splitting error. In what follows, we study the first-, second- and fourth-order versions of the method. According to the BCH formula, the first-order approximation of the exponential operator in equation (6) is given by

$$\varphi_1(\Delta t) = \exp(\Delta t\,\mathcal{L})\exp(\Delta t\,\mathcal{N}). \qquad (9)$$

Therefore, for the first-order version of the split-step method, the advancement in time is carried out in two steps. In the first step, a so-called intermediate solution is computed by advancing the solution according to the purely nonlinear equation. In the second step, the solution is advanced according to the linear dispersive equation in which the intermediate solution is used as an initial condition. In the second-order version of the method, the exponential operator in equation (6) is approximated by

$$\varphi_2(\Delta t) = \exp\left(\tfrac{1}{2}\Delta t\,\mathcal{N}\right)\exp(\Delta t\,\mathcal{L})\exp\left(\tfrac{1}{2}\Delta t\,\mathcal{N}\right) \qquad (10)$$

which is symmetric, that is, $\varphi_2(\Delta t)\varphi_2(-\Delta t) = 1$. A fourth-order splitting is given in the form

$$\varphi_4(\Delta t) = \varphi_2(\omega\Delta t)\,\varphi_2[(1 - 2\omega)\Delta t]\,\varphi_2(\omega\Delta t) \qquad (11)$$

where $\omega = (2 + 2^{1/3} + 2^{-1/3})/3$. Note that the number of products of exponential operators increases with the order of decay of the splitting error.
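The orders of the splittings (9) and (10) can be observed on a toy problem. The sketch below replaces the operators by two non-commuting 2 x 2 matrices whose exponentials are known in closed form; the matrices and function names are illustrative choices, not part of the paper.

```python
import numpy as np

# Nilpotent matrices: exp(t*A) = I + t*A and exp(t*B) = I + t*B exactly,
# while exp(t*(A + B)) = [[cosh t, sinh t], [sinh t, cosh t]].
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
I2 = np.eye(2)


def exact(t):
    return np.array([[np.cosh(t), np.sinh(t)],
                     [np.sinh(t), np.cosh(t)]])


def phi1(t):
    """First-order splitting exp(tA) exp(tB), cf. (9)."""
    return (I2 + t * A) @ (I2 + t * B)


def phi2(t):
    """Symmetric second-order splitting, cf. (10)."""
    half = I2 + 0.5 * t * B
    return half @ (I2 + t * A) @ half


def local_error(phi, t):
    """Maximum entrywise error of one step of size t."""
    return np.max(np.abs(phi(t) - exact(t)))
```

Halving the step reduces the one-step error of `phi1` by a factor of about 4 and that of `phi2` by a factor of about 8, that is, local errors of order $\Delta t^2$ and $\Delta t^3$, which correspond to first- and second-order global accuracy.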

2.2 Space discretization 

Application of the numerical method requires truncation of the infinite interval to a finite interval [a, b]. We assume that w(x, t) satisfies the periodic boundary condition w(a, t) = w(b, t) for $t \in [0, T]$. If the spatial period is, for convenience, normalized to $[0, 2\pi]$ using the transformation $X = 2\pi(x - a)/(b - a)$, the GNLS equation becomes

$$i w_t + p\,w_{XX} + q_1 |w|^2 w + q_2 |w|^4 w + i\tilde q_3 (|w|^2)_X w + i\tilde q_4 |w|^2 w_X = 0 \qquad (12)$$

where

$$p = \left(\frac{2\pi}{b - a}\right)^2, \qquad \tilde q_3 = \left(\frac{2\pi}{b - a}\right) q_3, \qquad \tilde q_4 = \left(\frac{2\pi}{b - a}\right) q_4. \qquad (13)$$

The interval $[0, 2\pi]$ is divided into N equal subintervals with grid spacing $\Delta X = 2\pi/N$ where the integer N is even. The spatial grid points are given by $X_j = 2\pi j/N$, $j = 0, 1, 2, \ldots, N$. The approximate solution to $w(X_j, t)$ is denoted by $W_j(t)$. The discrete Fourier transform of the sequence $\{W_j\}$ is defined as

$$\hat W_k = \mathcal{F}_k[W_j] = \frac{1}{N}\sum_{j=0}^{N-1} W_j \exp(-ikX_j), \qquad -\frac{N}{2} \le k \le \frac{N}{2} - 1. \qquad (14)$$

The inversion formula for the discrete Fourier transform (14) is

$$W_j = \mathcal{F}_j^{-1}[\hat W_k] = \sum_{k=-N/2}^{N/2-1} \hat W_k \exp(ikX_j), \qquad j = 0, 1, 2, \ldots, N - 1. \qquad (15)$$

Here $\mathcal{F}$ denotes the discrete Fourier transform and $\mathcal{F}^{-1}$ its inverse. These transforms can be realized efficiently via a fast Fourier transform (FFT) algorithm. For the FFT algorithm used here, the integer N must have only prime factors 2 and 3. In both linear and nonlinear subproblems we approximate spatial derivatives using discrete Fourier transforms.
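Spatial differentiation with the discrete Fourier transform amounts to multiplying the k-th coefficient by ik. A minimal sketch with NumPy follows; note that NumPy places the factor 1/N in the inverse transform rather than in the forward one as in (14), which makes no difference for the derivative. The function name is illustrative.

```python
import numpy as np


def spectral_derivative(W):
    """Approximate dW/dX on the periodic grid X_j = 2*pi*j/N, j = 0..N-1."""
    N = W.size
    # integer wavenumbers in FFT ordering: 0, 1, ..., N/2-1, -N/2, ..., -1
    k = np.fft.fftfreq(N, d=1.0 / N)
    return np.fft.ifft(1j * k * np.fft.fft(W))
```

For a band-limited function such as sin(3X) the result agrees with the exact derivative up to round-off.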
2.3 Time integration

We consider a split-step method for the GNLS equation, in which the linear 
equation 

wt - ipwxx = 0 (16) 

and the nonlinear equation 

Wt - iqi \w\‘^ w - iq2 \w\^ w + q^{\w\‘^)xw + ^4 wx = 0 (17) 

are solved in a given sequential order corresponding to one of the splitting
formulas (9)-(11). The linear equation (16) can be solved by means of the discrete
Fourier transform, and the advancement in time is performed according to

$$W_j^{m+1} = \mathcal{F}_j^{-1}\bigl[\exp(-ipk^2\Delta t)\,\mathcal{F}_k[W_j^m]\bigr] \tag{18}$$

Here $\Delta t$ is the time step and $W_j^m$ denotes the approximation to $w(X_j, m\Delta t)$. The
spatial discretization of the nonlinear equation (17) by a Fourier pseudospectral
method can be written as

$$\frac{dW_j}{dt} = i\bigl(q_1 |W_j|^2 + q_2 |W_j|^4\bigr) W_j - \mathcal{F}_j^{-1}\bigl[ik\tilde q_3\,\mathcal{F}_k[|W_j|^2]\bigr] W_j - \mathcal{F}_j^{-1}\bigl[ik\tilde q_4\,\mathcal{F}_k[W_j]\bigr] |W_j|^2, \quad j = 0, 1, 2, \ldots, N-1 . \tag{19}$$

For the time integration of this equation, instead of using an approximate
analytical technique [2], we adopt a different approach and employ a fourth-order
Runge-Kutta method. The total error involved in integrating from
time $t$ to time $t + \Delta t$ is then the sum of the splitting error and the temporal
discretization error of the nonlinear equation (17).

The first-order split-step Fourier method for the GNLS equation can be
summarized as follows: given the data $W_j^m$ at any time step $t = t_m$, first
advance the solution according to the nonlinear part, namely solve equation
(19) using the fourth-order Runge-Kutta method for time integration. The result
becomes the initial data for the linear problem, which is solved by the discrete
Fourier transform as indicated by equation (18). The extension of the first-order
split-step scheme based on equation (9) to the second-order and fourth-order
split-step schemes based on equations (10) and (11), respectively, is
straightforward.
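The procedure summarized above can be sketched in a few lines. The code below is an illustrative sketch under stated assumptions, not the authors' program: it works in the physical coordinate $x$ (so the linear coefficient is 1 and $q_3$, $q_4$ are unscaled, which is equivalent to (12) and (13)), the second-order step is taken as half nonlinear, full linear, half nonlinear (the ordering in (10) is not reproduced here, so this choice is an assumption), and the names `solve_gnls`, `strang` and `exact_solitary_wave` are hypothetical. The solitary wave (20)-(21) of Section 3.1 serves as test data.

```python
import numpy as np

# Weight of the fourth-order composition (11).
OMEGA = (2.0 + 2.0 ** (1.0 / 3.0) + 2.0 ** (-1.0 / 3.0)) / 3.0

def exact_solitary_wave(x, t):
    """Solitary wave (20)-(21), valid for q1=1/2, q2=-7/4, q3=-1, q4=-2."""
    xi = x - 2.0 * t - 15.0
    amplitude = np.sqrt(4.0 / (4.0 + 3.0 * np.sinh(xi) ** 2))
    theta = 2.0 * np.arctanh(0.5 * np.tanh(xi)) + x - 15.0
    return amplitude * np.exp(1j * theta)

def solve_gnls(w0, a, b, dt, n_steps, q, order=2):
    """Split-step Fourier integration on the periodic interval [a, b)."""
    q1, q2, q3, q4 = q
    n = w0.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=(b - a) / n)  # physical wavenumbers

    def ddx(f):
        return np.fft.ifft(1j * k * np.fft.fft(f))

    def rhs(w):
        # Right-hand side of the semi-discrete nonlinear problem, cf. (19).
        m = np.abs(w) ** 2
        return 1j * (q1 * m + q2 * m ** 2) * w - q3 * ddx(m) * w - q4 * m * ddx(w)

    def nonlinear(w, h):
        # One classical fourth-order Runge-Kutta step of size h.
        k1 = rhs(w)
        k2 = rhs(w + 0.5 * h * k1)
        k3 = rhs(w + 0.5 * h * k2)
        k4 = rhs(w + h * k3)
        return w + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

    def linear(w, h):
        # Exact solution of w_t = i w_xx in Fourier space, cf. (18).
        return np.fft.ifft(np.exp(-1j * k ** 2 * h) * np.fft.fft(w))

    def strang(w, h):
        return nonlinear(linear(nonlinear(w, 0.5 * h), h), 0.5 * h)

    def step(w):
        if order == 1:
            return linear(nonlinear(w, dt), dt)  # nonlinear part first, then linear
        if order == 2:
            return strang(w, dt)
        if order == 4:
            w = strang(w, OMEGA * dt)
            w = strang(w, (1.0 - 2.0 * OMEGA) * dt)
            return strang(w, OMEGA * dt)
        raise ValueError("order must be 1, 2 or 4")

    w = w0.astype(complex)
    for _ in range(n_steps):
        w = step(w)
    return w

a, b, n = -20.0, 60.0, 256
x = a + (b - a) * np.arange(n) / n
q = (0.5, -1.75, -1.0, -2.0)
dt, n_steps = 0.005, 100
w2 = solve_gnls(exact_solitary_wave(x, 0.0), a, b, dt, n_steps, q, order=2)
err2 = np.max(np.abs(w2 - exact_solitary_wave(x, dt * n_steps)))
```

Note that the middle substep of the fourth-order composition uses the negative step $(1 - 2\omega)\Delta t$; both the Fourier multiplier and the Runge-Kutta step accept a negative step size, so no special handling is needed.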



3 Numerical Experiments 

To gain insight into the performance of the suggested split-step schemes we
perform the following three numerical experiments. The conservation properties
of the split-step schemes are examined by calculating discrete analogues of the
conserved quantities $I_1$, $I_2$ and $I_3$. The relative errors in discrete
approximations to the conservation integrals (2), (3) and (4) are denoted by $\delta_1$, $\delta_2$ and $\delta_3$,
respectively, and they are defined by $\delta_1 = |I_1 - I_{10}|/|I_{10}|$, $\delta_2 = |I_2 - I_{20}|/|I_{20}|$
and $\delta_3 = |I_3 - I_{30}|/|I_{30}|$, where $I_1, I_2, I_3$ and $I_{10}, I_{20}, I_{30}$ represent the
calculated values of the conserved quantities at times $t$ and $t = 0$,
respectively.



3.1 Solitary wave solution 



The purpose of the present numerical experiment is to verify numerically that 
the proposed split-step schemes exhibit the expected first-order, second-order 
and fourth-order convergence in time. The GNLS equation has a travelling 
solitary wave solution [1, 2, 5], which has the form 



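$$w(x,t) = \left[\frac{4}{4 + 3\sinh^2(x - 2t - 15)}\right]^{1/2} \exp\bigl(i\theta(x,t)\bigr), \tag{20}$$

$$\theta(x,t) = 2\tanh^{-1}\left[\tfrac{1}{2}\tanh(x - 2t - 15)\right] + x - 15 \tag{21}$$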



for the choice of coefficients $q_1 = 1/2$, $q_2 = -7/4$, $q_3 = -1$, $q_4 = -2$. This
solution represents a solitary wave initially at $x = 15$ moving to the right with
velocity 2. The problem is first solved on the space interval $5 \le x \le 35$, as
in [1, 2], for times up to $t = 3$. We present in Figure 1(a) the $L_\infty$-errors of
the first-order, second-order and fourth-order split-step schemes as a function
of $N$ for the final time $t = 3$ on a $\log_{10}$-$\log_{10}$ scale. We use the relation
$\Delta t = \nu(\Delta x)^2$ to determine the value of $\Delta t$ for a given $\Delta x$ $(= (b - a)/N)$, where
$\nu$ is fixed at $\nu = 0.1$. We observe that the $L_\infty$-errors decrease
with increasing $N$ until the boundaries start exerting their influence. The
nondecreasing error behavior for the second-order and fourth-order schemes
beyond $N = 96$ is due to the limited space interval $5 \le x \le 35$.
To show that this behavior can be eliminated by balancing the error due
to boundary effects with the error due to internal resolution, we repeat the
experiment of Figure 1(a) for the space interval $-20 \le x \le 60$. The results
are presented in Figure 1(b), again on a $\log_{10}$-$\log_{10}$ scale. This time the
effect of the boundaries disappears and the $L_\infty$-errors continue to decrease with
increasing $N$.

To test whether the split-step schemes exhibit the expected convergence
rates in time we perform some numerical experiments for various values of the
time step $\Delta t$ and a fixed value of $\Delta x$. In these experiments we take $N = 512$
to keep the spatial accuracy high. The results are shown in Table 1. We present
the $L_\infty$-errors for the terminating time $t = 3$. The convergence rates agree
well with the expected rates for the first-order, second-order and fourth-order
split-step schemes. The orders of decay of the $L_\infty$-errors are those of the
splitting formulae employed for the temporal integration.
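The observed orders reported in Table 1 follow from pairs of consecutive errors through $\log(e_{i-1}/e_i)/\log(\Delta t_{i-1}/\Delta t_i)$. The short sketch below recomputes them from the tabulated first-order and fourth-order errors; the helper name `observed_orders` is an assumption made for the example.

```python
import math

def observed_orders(steps, errors):
    """Observed convergence order between consecutive (step, error) pairs."""
    return [
        math.log(errors[i - 1] / errors[i]) / math.log(steps[i - 1] / steps[i])
        for i in range(1, len(steps))
    ]

# Values copied from Table 1.
time_steps = [0.05, 0.01, 0.005, 0.003, 0.001, 0.0005]
first_order_errors = [1.604e-2, 3.109e-3, 1.552e-3, 9.306e-4, 3.100e-4, 1.549e-4]
fourth_order_errors = [4.425e-4, 1.098e-6, 7.179e-8, 9.436e-9, 1.176e-10, 7.637e-12]

orders_1 = observed_orders(time_steps, first_order_errors)
orders_4 = observed_orders(time_steps, fourth_order_errors)
```

The first entries of `orders_1` and `orders_4` reproduce the tabulated values 1.019 and 3.727 to the printed precision.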

To compare the proposed split-step Fourier schemes in terms of computational
efficiency, we fix $\Delta t$ and $\Delta x$ and measure the computing times, the
$L_2$-error, the $L_\infty$-error and the conservation errors $\delta_1$, $\delta_2$ and $\delta_3$ at the
terminating time $t = 3$. The trapezoidal rule is used for the numerical quadrature of
Fig. 1. The $L_\infty$-errors at $t = 3$ as a function of the number of spatial grid points
for the first-order (SS1), second-order (SS2) and fourth-order (SS4) split-step Fourier
schemes. (a) The space interval: $5 \le x \le 35$, (b) The space interval: $-20 \le x \le 60$



Table 1. Comparison of the convergence rates in time for the first-order, second-order
and fourth-order split-step Fourier schemes in the case of a single solitary wave
($-20 \le x \le 60$).

          First-order           Second-order          Fourth-order
Δt        L∞          Order     L∞          Order     L∞           Order
0.0500    1.604E-2    -         2.079E-3    -         4.425E-4     -
0.0100    3.109E-3    1.019     7.975E-5    2.026     1.098E-6     3.727
0.0050    1.552E-3    1.002     1.991E-5    2.002     7.179E-8     3.935
0.0030    9.306E-4    1.001     7.167E-6    2.000     9.436E-9     3.972
0.0010    3.100E-4    1.001     7.962E-7    2.000     1.176E-10    3.991
0.0005    1.549E-4    1.001     1.990E-7    2.000     7.637E-12    3.945



the integrals. The results are presented in Table 2. Note that the computing
times in Table 2 are normalized so that the computing time of the first-order
split-step scheme is one unit. The results show that each of the conserved
quantities is very well preserved by the split-step schemes. Furthermore, we observe
that the computing time increases with the order of the split-step
method. We conclude that the fourth-order split-step scheme is computationally
more efficient than the first-order and second-order schemes, since it reduces
the errors by several orders of magnitude for roughly four times the computing time.



Table 2. Comparison of the $L_\infty$-error, the $L_2$-error, the conservation errors $\delta_1$,
$\delta_2$ and $\delta_3$ and the computing times for the first-order, second-order and fourth-order
split-step Fourier schemes ($N = 512$, $-20 \le x \le 60$, $\Delta t = 0.61038 \times 10^{-3}$).

Method          L∞          L2          δ1          δ2          δ3          Normalized cpu
First-order     1.892E-04   1.234E-04   1.691E-13   2.243E-07   9.984E-09   1.0
Second-order    2.966E-07   1.134E-07   2.198E-13   1.865E-13   1.381E-13   1.7
Fourth-order    1.716E-11   5.156E-12   5.913E-13   9.309E-14   3.985E-13   4.1
3.2 Interacting Solitons

In the second numerical experiment we study the interaction of two solitons
for the integrable case of the GNLS equation, in which the coefficients are
$q_1 = q_2 = 1$, $q_3 = -2$, $q_4 = 0$. The initial condition is given by

$$w(x,0) = \frac{1}{\sqrt2}\,\mathrm{sech}\left[\tfrac12(x-15)\right]\exp i\left\{\tfrac14(x-15) + \tanh\left[\tfrac12(x-15)\right]\right\} + \frac{1}{2\sqrt2}\,\mathrm{sech}\left[\tfrac14(x-35)\right]\exp i\left\{-\tfrac12(x-35) + \tfrac12\tanh\left[\tfrac14(x-35)\right]\right\} .$$

This initial condition corresponds to two solitons, one initially located at $x = 15$
and moving to the right with speed 1/2 and the other initially located at $x = 35$
and moving to the left with speed 1. The exact values of the conserved
quantities for this problem are $I_1 = 3$, $I_2 = 3/16$ and $I_3 = 0$. The problem
is solved on the interval $-60 \le x \le 110$ for times up to $t = 20$ using the
fourth-order split-step scheme. The numerical results show that the solitary
waves are stable under the collision. Also, each of the conserved quantities is
very well preserved, up to 10“^^ for $I_1$ and $I_2$ and up to 10“^^ for $I_3$, by the
fourth-order split-step Fourier scheme. This behavior provides a valuable check
on the numerical results.

3.3 Blow-up

For certain values of the coefficients and certain initial conditions, solutions
to the GNLS equation experience finite-time blow-up [1]. We now apply the
fourth-order split-step scheme to a case of the GNLS equation in which the
exact solution blows up in finite time. The initial condition is the Gaussian
function $w(x,0) = \exp(-x^2)$ and the coefficients are $q_1 = -2$, $q_2 = 20$,
$q_3 = 0$, $q_4 = 0$. The exact values of the conserved quantities for this problem are
$I_1 = \sqrt{\pi/2}$, $I_2 = \sqrt{\pi}\,(9\sqrt2 + 9 - 20\sqrt6)/18$ and $I_3 = 0$.
In [1], it has been shown analytically that the exact solution $w(x,t)$ for this
problem will blow up in finite time and, furthermore, that an upper bound on the
blow-up time is $t \approx 1.7$.
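The quoted exact values can be checked numerically. The sketch below is only a plausibility check under an explicit assumption: the definitions (2) and (3) of the invariants are not reproduced in this part of the text, so the integrands used here, $I_1 = \int |w|^2\,dx$ and, for $q_3 = q_4 = 0$, $I_2 = \int \bigl(|w_x|^2 - \tfrac{q_1}{2}|w|^4 - \tfrac{q_2}{3}|w|^6\bigr)\,dx$, are assumed forms chosen because they reproduce the closed-form values stated above. The grid ($N = 432$ on $-7.5 \le x \le 7.5$) matches the one used for Table 3.

```python
import numpy as np

q1, q2 = -2.0, 20.0
a, b, n = -7.5, 7.5, 432
dx = (b - a) / n
x = a + dx * np.arange(n)

w = np.exp(-x ** 2)          # Gaussian initial condition (real-valued)
w_x = -2.0 * x * w           # its exact derivative

# On a periodic grid the trapezoidal rule reduces to a plain sum times dx.
I1_num = dx * np.sum(w ** 2)
I2_num = dx * np.sum(w_x ** 2 - 0.5 * q1 * w ** 4 - (q2 / 3.0) * w ** 6)

# Closed-form values quoted in the text.
I1_exact = np.sqrt(np.pi / 2.0)
I2_exact = np.sqrt(np.pi) * (9.0 * np.sqrt(2.0) + 9.0 - 20.0 * np.sqrt(6.0)) / 18.0
```

Both closed-form values agree with the entries of Table 3 at $t = 0$ (1.253314 and -2.684467) to the printed digits.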

In the present study, the above problem is solved on the interval $-7.5 \le x \le 7.5$
for times up to $t = 0.08$. We present the numerical results obtained using
the fourth-order scheme in Table 3. Although a formal proof of the existence
of the blow-up is not presented here, the numerical results strongly indicate
that a blow-up is well underway by time $t = 0.08$. This is consistent with
the numerical results presented in [1] and [5]. The fact that the three results
for the predicted time of blow-up, which were obtained by totally different
methods, are in complete agreement supports their correctness. As
in [1], we conclude that the upper bound given in [1] is not sharp.
Table 3. Variation of discrete approximations of the conserved quantities $I_1$, $I_2$ and
$I_3$ and $|W(0, t)|$ with time for the fourth-order split-step Fourier scheme ($N = 432$,
$\Delta t = 0.005$).

t       I1          I2           I3               |W(0,t)|
0.00    1.253314    -2.684467    -9.793286E-17    1.000000
0.01    1.253314    -2.684467    -6.651917E-13    1.007348
0.06    1.253314    -2.684467    -4.673614E-12    1.526254
0.07    1.253314    -2.684448    -4.395637E-12    2.376429
0.08    1.253352    -2.829258    -6.085108E-10    3.430374



4 Conclusions 

In this study we have applied the well-known split-step Fourier method to
the GNLS equation. We have presented three split-step schemes which differ
mainly in the order of the splitting approximation used. The method is easy
to implement on a computer, and one can easily introduce higher-order splitting
formulae to greatly increase the accuracy of the split-step method. The numerical
experiments reported here show that the fourth-order split-step Fourier scheme
is advisable in situations where accuracy rather than computational cost is of
prime importance.

The numerical solutions obtained by using the present numerical schemes
for the case of one solitary wave are compared with the exact solutions in
order to assess the accuracy of these schemes. In addition, the performance of
the numerical schemes has been monitored by computing both the conserved
quantities and the computational costs. We have found that the numerical
results are in good agreement with the exact solutions and the results reported
in the literature and that the schemes have remarkable conservation properties
for global invariants. Moreover, the collision of two solitons and a finite-time
blow-up problem are investigated numerically, and particular attention is
paid to the behavior of the conserved quantities as an indicator of numerical
difficulties.

The approaches presented in previously published papers related to numer- 
ical solutions of the GNLS equation have been mostly limited to the absence of 
the nonlinear derivative terms [6, 7, 8]. The numerical results presented above 
show that the nonlinear derivative terms do not create any special difficulties 
in the split-step Fourier method. 
Summary. In this note we present some of our recent results concerning flows with
pressure and shear dependent viscosity. From the numerical point of view several
problems arise, first from the difficulty of approximating incompressible velocity fields
and, second, from poor conditioning and possible lack of differentiability of the
nonlinear functions involved, due to the material laws. The lack of differentiability can
be treated by regularisation. Then, Newton-like methods can be applied as a linearization
technique; however, the presence of the pressure in the viscosity function leads to
an additional term introducing a new non-classical linear saddle point problem. The
difficulty related to the approximation of incompressible velocity fields is treated by
applying the nonconforming Rannacher-Turek Stokes element. However, we then
face another problem related to the nonconforming approximation for problems
involving the symmetric part of the gradient: the classical discrete 'Korn's Inequality'
is not satisfied. A new and more general approach which involves the jump across
the inter-element boundaries should be used, which requires a small modification of
the discrete bilinear form by adding an interface term penalizing the jump of the
velocity over edges. This is achieved via a modified procedure in the derivation of
a Discontinuous Galerkin formulation. As a solver for the discrete nonlinear systems,
a Newton variant is discussed, while a 'Vanka-like' smoother as defect correction
inside a direct multigrid approach is presented. The results of some computational
experiments for realistic flow configurations are provided, which include a
pressure-dependent viscosity, too.



1 Introduction 

The flow of powders poses a new, challenging and interesting problem for
the CFD community: at very high concentrations and low rates of strain, grains
are in permanent contact, rolling on each other. Therefore a frictional stress
model must be taken into account. This can be done using plasticity and similar
theories in which the material behavior is assumed to be independent of
the velocity gradient or the rate of strain. This is in contrast to viscous Newtonian
flow, where the stress depends specifically on the rate of strain. Furthermore,
flowing powders do not exhibit viscosity and, again, this shows that a Newtonian
rheology cannot describe granular flow accurately. It is assumed that the
material is incompressible, dry, cohesionless, and perfectly rigid-plastic. Such
properties are relevant for modelling granular flows via special models of
continuum mechanics, such as the Schaeffer model [9].

1.1 Equations of motion 

The general equations describing the motion of incompressible powders read:

Conservation of mass: $\partial\rho/\partial t + \nabla\cdot(\rho u) = 0$; $D/Dt$ is the material derivative
and $u$ is the velocity vector.

Incompressible material: The bulk density, $\rho$, is a constant, so that
$\nabla\cdot u = 0$.

Equation of motion: $\rho\,\frac{Du}{Dt} = -\nabla\cdot T + \rho g$ with $T = S + pI$.

1.2 Constitutive equations 

The constitutive equation relates the deviatoric tensor, $S$, to the velocity
through the second invariant of the rate of deformation,
$D_{II} = \frac12 D : D$, where the rate of deformation is given by $D = \frac12(\nabla u + \nabla^T u)$.

Newtonian law: $S = 2\nu_0 D$
Power law: $S = 2\nu(D_{II})D$, $\nu(z) = \nu_0 z^{\frac r2 - 1}$, $r > 1$

Schaeffer's law: For a powder, a constitutive equation first introduced
by Schaeffer [9] has to obey a

- yield condition: $\|S\| = \sqrt2\,p\sin\phi$,

- flow rule: $S = \lambda D$.

We use this correlation to obtain the constitutive equation

$$T = \sqrt2\,p\sin\phi\,\frac{D}{\|D\|} + pI .$$

1.3 Generalized Navier-Stokes equations

The problem can be stated in the framework of the generalized incompressible
Navier-Stokes equations:

$$\rho\,\frac{Du}{Dt} = -\nabla p + \nabla\cdot\bigl(\nu(p, D_{II})\,D\bigr) + \rho g, \qquad \nabla\cdot u = 0 .$$

If we define the nonlinear pseudo-viscosity $\nu(\cdot,\cdot)$ as a function of the second
invariant of the rate of deformation $D_{II}$ and the 'pressure' $p$, we can show
that different materials, including powders, fit within different viscosity laws:

- Power law, defined for $\nu(z,p) = \nu_0 z^{\frac r2 - 1}$

- Bingham law, defined for $\nu(z,p) = \nu_0 z^{-\frac12}$

- Schaeffer's law (including the 'pressure'), defined for $\nu(z,p) = \sin\phi\;p\,z^{-\frac12}$
2 Problem formulation

Let us consider the flow of the stationary (!) generalized Navier-Stokes problem
in (1.3) in a bounded domain $\Omega \subset \mathbb{R}^2$. If we restrict the set $V$ of test functions
to be divergence-free and if we take the constitutive laws into account, the
above equations from (1.3) lead to:

$$\int_\Omega 2\nu(D_{II}(u),p)\,D(u):D(v)\,dx + \int_\Omega (u\cdot\nabla u)\,v\,dx = \int_\Omega f\,v\,dx, \quad \forall v \in V . \tag{1}$$

It is straightforward to penalize the constraint $\operatorname{div} v = 0$ to derive the equivalent
mixed formulation of (1):

Find $(u,p) \in X \times M$ (with the spaces $X = H_0^1(\Omega)$ and $M = L^2(\Omega)$) such
that:

$$\int_\Omega 2\nu(D_{II}(u),p)\,D(u):D(v)\,dx + \int_\Omega (u\cdot\nabla u)\,v\,dx + \int_\Omega p\,\operatorname{div} v\,dx = \int_\Omega f\,v\,dx, \quad \forall v \in X,$$

$$\int_\Omega q\,\operatorname{div} u\,dx = 0, \quad \forall q \in M. \tag{2}$$



2.1 Nonlinear solver: Newton iteration 



In this approach, the nonlinearity is first handled on the continuous level. Let
$u^l$ be the initial state. The (continuous) Newton method consists of finding
$u \in V$ such that

$$\int_\Omega 2\nu(D_{II}(u^l),p^l)\,D(u):D(v)\,dx + \int_\Omega 2\partial_1\nu(D_{II}(u^l),p^l)\,[D(u^l):D(u)]\,[D(u^l):D(v)]\,dx + \int_\Omega 2\partial_2\nu(D_{II}(u^l),p^l)\,[D(u^l):D(v)]\,p\,dx$$

$$= \int_\Omega f\,v\,dx - \int_\Omega 2\nu(D_{II}(u^l),p^l)\,D(u^l):D(v)\,dx, \quad \forall v \in V, \tag{3}$$



where $\partial_i\nu(\cdot,\cdot)$, $i = 1, 2$, is the partial derivative of $\nu$ with respect to the first and
second variables, respectively. To see this, set $X = D(u^l)$, $x = D(u)$, $Y = p^l$,
$y = p$, $F(x,y) = \nu(\frac12|x|^2, y)\,x$ and $f(t) = F(X + tx, Y + ty)$, so that

$$\partial_{x_j}F_i(x,y) = \partial_1\nu(\tfrac12|x|^2,y)\,x_j x_i + \nu(\tfrac12|x|^2,y)\,\delta_{ij}, \qquad \partial_y F_i(x,y) = \partial_2\nu(\tfrac12|x|^2,y)\,x_i , \tag{4}$$

where $\delta_{ij}$ stands for the standard Kronecker symbol. Having

$$f_i'(t) = \sum_j \partial_{x_j}F_i(X+tx, Y+ty)\,x_j + \partial_y F_i(X+tx, Y+ty)\,y$$
$$= \nu(\tfrac12|X+tx|^2, Y+ty)\,x_i + \partial_1\nu(\tfrac12|X+tx|^2, Y+ty)\,(X+tx, x)\,(X_i + tx_i) + \partial_2\nu(\tfrac12|X+tx|^2, Y+ty)\,y\,(X_i + tx_i), \tag{5}$$

we let $t$ tend to zero and obtain the Fréchet derivative:

$$\nabla\cdot\bigl[\,2\nu(D_{II}(u^l),p^l)\,D(u) + 2\partial_1\nu(D_{II}(u^l),p^l)\,(D(u^l):D(u))\,D(u^l) + 2\partial_2\nu(D_{II}(u^l),p^l)\,p\,D(u^l)\,\bigr] \tag{6}$$
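The directional derivative derived above can be checked against a finite difference. The sketch below does this for one sample tensor; it is illustrative only. The viscosity function `nu` (a regularised power law multiplied by an exponential pressure factor), its parameters `BETA`, `R`, `EPS`, and the sample tensors are hypothetical choices made for the test and are not taken from the text.

```python
import numpy as np

BETA, R, EPS = 0.3, 1.5, 1e-2  # hypothetical model parameters

def nu(z, p):
    """Sample viscosity nu(z, p); z plays the role of D_II."""
    return np.exp(BETA * p) * (EPS + z) ** (R / 2.0 - 1.0)

def d1_nu(z, p):
    """Partial derivative of nu with respect to its first argument."""
    return np.exp(BETA * p) * (R / 2.0 - 1.0) * (EPS + z) ** (R / 2.0 - 2.0)

def d2_nu(z, p):
    """Partial derivative of nu with respect to its second argument."""
    return BETA * nu(z, p)

def F(x, y):
    """F(x, y) = nu(|x|^2 / 2, y) x with |x|^2 = x : x."""
    return nu(0.5 * np.sum(x * x), y) * x

def directional_derivative(X, Y, x, y):
    """Limit t -> 0 of f'(t): the three terms appearing in (6)."""
    z = 0.5 * np.sum(X * X)
    return (nu(z, Y) * x
            + d1_nu(z, Y) * np.sum(X * x) * X
            + d2_nu(z, Y) * y * X)

X = np.array([[1.0, 0.5], [0.5, -1.0]])     # stands in for D(u^l)
x = np.array([[0.2, -0.3], [-0.3, -0.2]])   # stands in for D(u)
Y, y = 0.7, -0.4                            # stand in for p^l and p

h = 1e-6
finite_difference = (F(X + h * x, Y + h * y) - F(X - h * x, Y - h * y)) / (2.0 * h)
analytic = directional_derivative(X, Y, x, y)
```

The term multiplied by `d2_nu` is the one that gives rise to the additional operator $B^*$ in the linear subproblems; it vanishes when the viscosity does not depend on the pressure.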



2.2 New linear auxiliary problem 



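The resulting auxiliary subproblems in each Newton step consist of finding
$(u,p) \in X \times M$ as solutions of the linear (discretized) systems

$$\begin{cases} A(u^l,p^l)\,u + \delta_d A^*(u^l,p^l)\,u + Bp + \delta_p B^*(u^l,p^l)\,p = R_u(u^l,p^l), \\ B^T u = R_p(u^l,p^l), \end{cases} \tag{7}$$

where $R_u(\cdot,\cdot)$ and $R_p(\cdot,\cdot)$ denote the corresponding nonlinear residual terms
for the momentum and continuity equations, and the operators $A(u^l,p^l)$, $B$,
$A^*(u^l,p^l)$ and $B^*(u^l,p^l)$ are defined as follows:

$$\langle A(u^l,p^l)u, v\rangle = \int_\Omega 2\nu(D_{II}(u^l),p^l)\,D(u):D(v)\,dx \tag{8}$$

$$\langle Bp, v\rangle = \int_\Omega p\,\nabla\cdot v\,dx \tag{9}$$

$$\langle A^*(u^l,p^l)u, v\rangle = \int_\Omega 2\partial_1\nu(D_{II}(u^l),p^l)\,[D(u^l):D(u)]\,[D(u^l):D(v)]\,dx \tag{10}$$

$$\langle B^*(u^l,p^l)v, p\rangle = \int_\Omega 2\partial_2\nu(D_{II}(u^l),p^l)\,[D(u^l):D(v)]\,p\,dx \tag{11}$$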



3 Discretization 

We consider a subdivision $T \in \mathcal{T}_h$ consisting of quadrilaterals in the domain
$\Omega_h \subset \mathbb{R}^2$, and we employ the rotated bilinear Rannacher-Turek element [5].
For any quadrilateral $T$, let $(\xi, \eta)$ denote a local coordinate system obtained by
joining the midpoints of the opposing faces of $T$. Then, in the nonparametric
case, we set on each element $T$

$$\tilde Q_1(T) := \operatorname{span}\{1, \xi, \eta, \xi^2 - \eta^2\} . \tag{12}$$

The degrees of freedom are determined by the nodal functionals
$\{F_\Gamma^{(a,b)}(\cdot),\ \Gamma \subset \partial\mathcal{T}_h\}$,

$$F_\Gamma^{(a)} := |\Gamma|^{-1}\int_\Gamma v\,d\gamma \quad\text{or}\quad F_\Gamma^{(b)} := v(m_\Gamma) \quad (m_\Gamma \text{ midpoint of edge } \Gamma), \tag{13}$$

such that the finite element space can be written as

$$W_h^{a,b} := \{v \in L^2(\Omega_h),\ v \in \tilde Q_1(T)\ \forall T \in \mathcal{T}_h,\ v \text{ continuous w.r.t. all nodal functionals } F_{\Gamma_{i,j}}^{(a,b)}(\cdot), \text{ and } F_{\Gamma_{i0}}^{(a,b)}(v) = 0\ \forall \Gamma_{i0}\}. \tag{14}$$

Here, $\Gamma_{i,j}$ denote all inner edges sharing the two elements $i$ and $j$, while $\Gamma_{i0}$
denote the boundary edges of $\partial\Omega_h$. In this paper, we always employ version 'a'
with the integral mean values as degrees of freedom. Then, the corresponding
discrete functions will be approximated in the spaces

$$V_h := W_h^{a} \times W_h^{a}, \qquad L_h := \{q_h \in L^2(\Omega),\ q_h|_T = \text{const.},\ \forall T \in \mathcal{T}_h\} . \tag{15}$$



Due to the nonconformity of the discrete velocities, the classical discrete
'Korn's Inequality' is not satisfied, which is important for problems involving
the symmetric part of the gradient [4]. Therefore, appropriate edge-oriented
stabilization techniques (see [1, 2, 8]) have to be included, which directly treat
the jump across the inter-element boundaries by adding the following
bilinear form




(16) 



for all basis functions $\phi_i$ and $\phi_j$ of $W_h^{a,b}$. Taking into account an additional
relaxation parameter $s$, the corresponding stiffness matrices are defined
via:

$$\langle Su, v\rangle = s \sum_{E \in E_I \cup E_D} \frac{1}{|E|}\int_E [u][v]\,d\sigma . \tag{17}$$

Here, the jump of a function $u$ on an edge $E$ is given by

$$[u] = \begin{cases} u^+\cdot n^+ + u^-\cdot n^- & \text{on internal edges } E_I, \\ u\cdot n & \text{on Dirichlet boundary edges } E_D, \\ 0 & \text{on Neumann boundary edges } E_N, \end{cases} \tag{18}$$

where $n$ is the outward normal to the edge and $(\cdot)^+$ and $(\cdot)^-$ indicate the value
of the generic quantity $(\cdot)$ on the two elements sharing the same edge.



4 Linear solver 

This section gives a brief description of the solution techniques used
for the resulting linear systems. For the nonconforming Stokes element
$\tilde Q_1/Q_0$, a 'local pressure Schur complement' preconditioner (see [7]), as a
generalization of so-called 'Vanka smoothers', is constructed on patches $\Omega_i$, which
are ensembles of one single or several mesh cells, and this local preconditioner
is embedded as a global smoother into an outer block Jacobi/Gauss-Seidel
iteration which acts directly on the coupled systems of generalized Stokes or
Oseen type, as described in [8]. If we denote by $R_u$ and $R_p$ the discrete
residuals for the momentum and continuity equations, which include the complete
stabilisation term due to the modified bilinear form $S$ as described in (17), one
smoothing step in defect-correction notation can be described as

$$\begin{bmatrix} u^{l+1} \\ p^{l+1} \end{bmatrix} = \begin{bmatrix} u^{l} \\ p^{l} \end{bmatrix} - \Bigl[\ \cdots\ \Bigr]^{-1} \begin{bmatrix} R_u(u^l,p^l) \\ R_p(u^l,p^l) \end{bmatrix} \tag{19}$$

with the matrix $F = A + \delta_d A^*$, where $A$, $B$, $A^*$ and $B^*$ are the discrete matrices
corresponding to the operators in (8), (9), (10) and (11). For the preconditioning
step only a part of the matrix, i.e. ^ + 5*, is involved. All other components in 
the multigrid approach, that is, intergrid transfer, coarse grid correction
and coarse grid solver, are the standard ones and are based on the underlying
mesh hierarchy and the properties of the nonconforming finite
elements (see [7] and [8] for the details).



5 Numerical tests 

5.1 Newtonian case 

In this case, the gradient and tensor formulations are equivalent; the accuracy 
and efficiency of the stabilized tensor discretization is checked by comparisons 
with the gradient formulation (see Table 1); the tests have been performed for 
the ’flow around cylinder’ benchmark configuration [10]. For all three formu- 
lations the lift and drag forces are very similar. 



Table 1. Efficiency of the stabilized nonconforming FEM: Lift and Drag forces (Level 5)

1/ν              Quantity   grad             tensor           stab. tensor
1                Drag       31252 × 10^(?)   31221 × 10^(?)   31231 × 10^(?)
                 Lift       30898 × 10^(?)   30924 × 10^(?)   30936 × 10^(?)
                 NL/MG      3/3              7/200            3/3
1000 (Re = 20)   Drag       55657 × 10^-4    55531 × 10^-4    55535 × 10^-4
                 Lift       10180 × 10^-6    10259 × 10^-6    10277 × 10^-6
                 NL/MG      11/4             11/12            11/3
5.2 Effect of convection

The average number of inner multigrid sweeps (MG) per outer nonlinear sweep
(NL) increases with mesh refinement (see Table 2), due to the more dominant
influence of the kernel function in the second order differential operator. Since,



Table 2. Nonlinear iterations (NL)/averaged multigrid sweeps (MG) per nonlinear
iteration for different viscosity parameters (Re numbers), various formulations
(gradient, tensor and stabilized tensor) and different mesh levels

                        1/ν = 1    1/ν = 10   1/ν = 1000
Level   Formulation     NL/MG      NL/MG      NL/MG
4       grad            3/3        4/3        11/4
        tensor          3/15       5/17       11/4
        stab. tensor    3/3        5/3        11/4
5       grad            3/3        4/3        11/3
        tensor          4/140      5/35       11/10
        stab. tensor    4/3        5/3        11/3
6       grad            3/3        4/3        11/3
        tensor          7/200      4/161      11/12
        stab. tensor    3/3        4/3        11/3



in contrast, convection dominates as the Reynolds number increases,
the average number of multigrid sweeps per nonlinear sweep decreases,
as the influence of the kernel function becomes irrelevant. This may explain
why many people from the CFD community did not pay much attention to
this problem before.

5.3 Power law case 

In this case the nonlinear viscosity has the form $\nu(z) = \nu_0 z^{\frac r2 - 1}$, $z = D_{II}$, and
the gradient and tensor formulations are no longer equivalent. The quality
of the solution is checked by comparisons with the well-known and stable
conforming $Q_2/P_1$ approximation; an extended description can be found in [3].
The accuracy of the nonconforming FEM is preserved by the stabilized tensor
discretization, see Table 3.

5.4 Pressure dependent viscosity 

Finally, the nonlinear (pseudo) viscosity has the form $\nu(p,z) = \exp(\beta p)$, and
we list the number of resulting nonlinear iterations and the averaged number
of multigrid sweeps per nonlinear iteration for both Newton and Fixpoint
methods as outer nonlinear solver. Table 4 shows that the presence of the new
linear operator $B^*$ cannot be ignored; otherwise, we destroy the efficiency of the
Newton method, which is necessary for the robust treatment of the significant
nonlinearity.
Table 3. Comparison of the approximation results for lift, drag and pressure difference
for two FEM approaches, the stabilized nonconforming $\tilde Q_1/Q_0$ and the classical
conforming $Q_2/P_1$ (see [3]).

Power              r = 1.5                              r = 1.1
Level  Elements    Drag      Lift    Δp      NN/NL      Drag     Lift     Δp      NN/NL
4      Q1/Q0       1594.20   14.25   24.56   9/2        916.02   3.7381   15.74   12/2
       Q2/P1       1635.80   14.39   25.09   8/140      953.94   3.9217   15.82   19/294
5      Q1/Q0       1615.60   14.43   24.81   8/2        935.13   3.9954   15.82   15/3
       Q2/P1       1637.60   14.44   25.07   9/723      957.64   4.0587   15.87   18/1162
6      Q1/Q0       1626.20   14.46   24.94   8/2        946.22   4.0592   15.85   13/5



Table 4. Corresponding results for the number of nonlinear iterations and the
averaged number of linear sweeps per nonlinear cycle

ν(z,p) = exp(βp)        Fixpoint                  Newton
Level   β               0.1    0.3    0.5         0.1    0.3    0.5
5       stab. tensor    6/2    12/2   33/2        3/3    4/2    4/3
        gradient        6/2    11/2   34/2        3/3    4/2    4/3
6       stab. tensor    5/3    11/3   65/2        3/3    3/3    3/3
        gradient        5/3    9/3    76/2        3/3    3/3    5/3



6 Conclusion and outlook 



We can conclude our present numerical analysis as follows: 

— The proposed stabilization technique is stable and accurate for the used 
FEM spaces. 

— The full (!) Newton method seems to be necessary for this type of nonlinear 
problem. 

— The multigrid convergence behaviour for this new class of auxiliary linear 
subproblems is 

— (almost) identical for both gradient and deformation tensor formulations: 
The stabilization for nonconforming FEM works fine! 

— depending on the involved pressure terms for both fixed point and Newton 
methods: More investigation should focus on the linear algebraic
problem, besides the nonlinear solution procedure!

In future, we will cover a wider range of granular materials (see [6] for 
a discussion): 



— General equation of motion for a powder 






q{p,p) 






(D-iV-u/) 



+ pg^ with 



Continuity equation 



^ -h V • {pu) — 0, and 



— Normality condition 
V-« = ^|lD-iV-u/

— The yield condition q{p, p) is given by: 



Powder properties 


Non- cohesive 


Cohesive 


Incompressible 


psin 4> 


p sin 4>-\- c cos (j) 


Compressible 


psin (j) 


to 

1 

VH 


psmcjyp'^ — 

qI. 
Piecewise Polynomial Approximations for
Linear Volterra Integro-Differential Equations 
with Nonsmooth Kernels* 



Arvet Pedas 

Institute of Applied Mathematics, University of Tartu, Liivi 2, 50409 Tartu, 
Estonia arvet.pedas@ut.ee 



Summary. The piecewise polynomial collocation method is discussed for solving
linear Volterra-Basset integro-differential equations with weakly singular or other
nonsmooth kernels. Using special graded grids, global convergence estimates are derived.
The error analysis is based on certain regularity properties of the solution of the
initial value problem.



1 Introduction 

Volterra integral equations and integro-differential equations arise naturally
in many mathematical models of various physical and biological phenomena. 
The study of their numerical methods has received considerable attention in 
the past. The survey articles [1, 2] and the monograph [3] convey a good pic- 
ture of these developments and contain an extensive bibliography. The present 
paper is most closely related to the works [3, 4, 5, 6, 7, 8, 9] where a discus- 
sion about the convergence of collocation methods for the numerical solution 
of linear Volterra integro-differential equations with weakly singular kernels 
is given. In the present paper we extend these investigations to a wider class 
of equations. First we study the regularity properties of the solution (Section 
2). Then we use these results in the construction and analysis of a piecewise 
polynomial collocation method for solving such equations numerically (Sec- 
tion 4). Using graded grids and an equivalent integral equation reformulation, 
we derive global convergence estimates for the numerical solutions. Our aim 
is to construct approximations which possess maximal convergence order on 
the whole interval of integration. The main results of the paper extend the 
corresponding results of [7, 8, 9] and are formulated in Theorems 1, 2 and 3. 

* This work was supported by the Estonian Science Foundation (Research Grant 
No. 5859). 
2 Integro-differential equation and smoothness of the
solution

Let $b \in \mathbb{R} = (-\infty, \infty)$, $b > 0$, and set $\Delta_b = \{(t,s) \in \mathbb{R}^2 : 0 \le t \le b,\ 0 \le s < t\}$,
$\bar\Delta_b = \{(t,s) \in \mathbb{R}^2 : 0 \le s \le t \le b\}$. We consider an initial-value problem for a
linear integro-differential equation of the form

$$y'(t) = p(t)y(t) + q(t) + \int_0^t K_1(t,s)\,y(s)\,ds + \int_0^t K_2(t,s)\,y'(s)\,ds, \quad 0 \le t \le b, \tag{1}$$

with given initial condition

$$y(0) = y_0, \quad y_0 \in \mathbb{R} . \tag{2}$$

Observe that, in contrast to "standard" Volterra integro-differential equations,
the integrand $K_2(t,s)y'(s)$ in (1) depends on the derivative $y'$ instead of the
solution $y$ itself. We assume that $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p, q \in C^{m,\nu}(0,b]$,
$m \in \mathbb{N} = \{1, 2, \ldots\}$, $\nu \in \mathbb{R}$, $\nu < 1$. Here $W^{m,\nu}(\Delta_b)$, $m \in \mathbb{N}$, $\nu < 1$, is defined as the
set of all $m$ times continuously differentiable functions $K : \Delta_b \to \mathbb{R}$ satisfying

$$\left|\left(\frac{\partial}{\partial t}\right)^i\left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)^j K(t,s)\right| \le c \begin{cases} 1 & \text{if } \nu + i < 0, \\ 1 + |\log(t-s)| & \text{if } \nu + i = 0, \\ (t-s)^{-\nu-i} & \text{if } \nu + i > 0, \end{cases} \tag{3}$$

with a constant $c = c(K)$ for all $(t,s) \in \Delta_b$ and all non-negative integers $i$ and
$j$ such that $i + j \le m$.

It follows from (3) (with $i = j = 0$, $0 < \nu < 1$) that the kernels $K_1(t,s)$
and $K_2(t,s)$ of (1) may possess a weak singularity as $s \to t$. In case $\nu < 0$ the
kernels $K_1$ and $K_2$ are bounded on $\Delta_b$ but their derivatives may be singular
as $s \to t$. In particular, $K_1$ and $K_2$ may have the form

$$K_{\alpha,\beta}(t,s) = \kappa(t,s)\,(t-s)^{-\alpha}\,|\log(t-s)|^\beta, \quad 0 \le \alpha < 1,\ \beta \ge 0,$$

where $\kappa : \bar\Delta_b \to \mathbb{R}$ is an $m$ times continuously differentiable function. Clearly,
$K_{\alpha,0} \in W^{m,\alpha}(\Delta_b)$, $0 < \alpha < 1$, $K_{0,1} \in W^{m,0}(\Delta_b)$ and $K_{\alpha,\beta} \in W^{m,\alpha+\varepsilon}(\Delta_b)$
for $0 \le \alpha < 1$, $\beta > 0$, with a small $\varepsilon > 0$ ($\varepsilon < 1 - \alpha$). In particular, if $K_1 = 0$
and $K_2 = K_{\alpha,0}$, $0 < \alpha < 1$, then equation (1) is of the type which is often
referred to as the Basset equation; the latter plays an important role in
the mathematical modelling of the diffusion of discrete particles in a turbulent
fluid (see, for example, [10, 11]).

The set $C^{m,\nu}(0,b]$, $m \in \mathbb{N}$, $\nu < 1$, consists of functions¹ $y \in C[0,b]$ which
are $m$ times continuously differentiable in $(0, b]$ and such that

$$\sum_{j=1}^{m} \sup_{0 < t \le b}\bigl(w_{j-(1-\nu)}(t)\,|y^{(j)}(t)|\bigr) < \infty .$$

Here

¹ By $C[a,b]$ we denote the Banach space of continuous functions $x : [a,b] \to \mathbb{R}$
with the norm $\|x\| = \max\{|x(t)| : a \le t \le b\}$. By $c, c_1, c_2, \ldots$ we denote positive
constants, which may be different in different inequalities.
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1 if A < 0 , 

wa( 0 = ■{ (1 + |logf|)“^ if A = 0, 0<t<b. 

if A > 0 , 

m 

Equipped with the norm = max |t;(t)|+X] sup (wj_(i_^)(t)|yWi(t)|) , 

0<t<b 0<t<b 

b] is a Banach space. Thus, if a function y belongs to 6], m G N, 

z/ < 1, then its derivatives can be estimated by 



1 if j < 1 - z/ , 

\y^^\t)\ < c 1 + |logt| if j = 1 - z/, 
if j > 1 — z/ , 



( 4 ) 



where 0 < t < 6 and j = 0, 1 , . . . , m. Note that C’^[0, 6], the set of m times 
continuously differentiable functions y : [a,b] R, belongs to for 

arbitrary z/ < 1. 

Introducing a new unknown function

$$ z = y', \tag{5} $$

and using (2), equation (1) may be rewritten as a linear Volterra integral
equation of the second kind with respect to $z$,

$$ z(t) = \int_0^t K_1(t,s) \int_0^s z(\tau)\,d\tau\,ds + \int_0^t [p(t) + K_2(t,s)]\,z(s)\,ds + f(t), \quad t \in [0,b], \tag{6} $$

which may also be expressed in the form

$$ z(t) = \int_0^t K(t,s)\,z(s)\,ds + f(t), \quad t \in [0,b], \tag{7} $$

where

$$ f(t) = q(t) + y_0\,p(t) + y_0 \int_0^t K_1(t,s)\,ds, \quad t \in [0,b], \tag{8} $$

and

$$ K(t,s) = p(t) + \int_s^t K_1(t,\tau)\,d\tau + K_2(t,s), \quad (t,s) \in \Delta_b. \tag{9} $$
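For completeness, the passage from (1), (2) to (6)-(9) can be spelled out as follows. This is only a routine verification based on the definitions above, not an additional result of the paper.

```latex
% From (5) and (2): y(s) = y_0 + \int_0^s z(\tau)\,d\tau. Substituting into (1) gives
z(t) = p(t)\Big[y_0 + \int_0^t z(s)\,ds\Big] + q(t)
     + \int_0^t K_1(t,s)\Big[y_0 + \int_0^s z(\tau)\,d\tau\Big]\,ds
     + \int_0^t K_2(t,s)\,z(s)\,ds ,
% which is (6) with f as in (8). Changing the order of integration,
\int_0^t K_1(t,s)\int_0^s z(\tau)\,d\tau\,ds
   = \int_0^t \Big[\int_\tau^t K_1(t,s)\,ds\Big]\,z(\tau)\,d\tau ,
% and renaming the integration variables yields (7) with the kernel K of (9).
```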



We will employ (6) in the construction of numerical solutions for problem
{(1),(2)} (see Section 4). For the smoothness analysis of the solution of
{(1),(2)} it is more convenient to use (7), which we write in the form $(I - T)z = f$,
where $I$ is the identity transformation and

$$ (Tz)(t) = \int_0^t K(t,s)\,z(s)\,ds, \quad t \in [0,b]. \tag{10} $$

In the sequel, for given Banach spaces $E$ and $F$ we denote by $\mathcal{L}(E,F)$
the Banach space of linear bounded operators $A : E \to F$ with the norm
$\|A\| = \sup\{\|Az\| : z \in E,\ \|z\| \le 1\}$.



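Lemma 1. Let $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p \in C^{m,\nu}[0,b]$, $m \in \mathbb{N}$, $\nu \in \mathbb{R}$, $\nu < 1$.
Then $T$ is linear and compact as an operator from $L^\infty(0,b)$ to $C[0,b]$. Moreover,
$T$ is compact as an operator from $C^{m,\nu}[0,b]$ to $C^{m,\nu}[0,b]$.

Proof. We present $T$ (see (9) and (10)) in the form $T = T_{01}T_{02} + T_1 + T_2$,
where the linear operators $T_{01}$, $T_{02}$, $T_1$ and $T_2$ are defined by setting

$$ (T_{01}z)(t) = p(t)\,z(t), \qquad (T_{02}z)(t) = \int_0^t z(s)\,ds, $$

$$ (T_1 z)(t) = \int_0^t L_1(t,s)\,z(s)\,ds \quad \text{with} \quad L_1(t,s) = \int_s^t K_1(t,\tau)\,d\tau, $$

$$ (T_2 z)(t) = \int_0^t K_2(t,s)\,z(s)\,ds. $$

It follows from $K_1, K_2 \in W^{m,\nu}(\Delta_b)$ that $L_1(t,s)$ is bounded for $(t,s) \in \Delta_b$
and $K_2(t,s)$ is at most weakly singular: $|K_2(t,s)| \le c\,(t-s)^{-\alpha}$, $(t,s) \in \Delta_b$,
$0 < \alpha < 1$. Therefore $T_1, T_2 : L^\infty(0,b) \to C[0,b]$ are compact. Clearly,
$T_{01} \in \mathcal{L}(C[0,b], C[0,b])$ and $T_{02} : L^\infty(0,b) \to C[0,b]$ is compact. This implies
that $T_{01}T_{02} \in \mathcal{L}(L^\infty(0,b), C[0,b])$ is compact. In summary,
$T_{01}T_{02} + T_1 + T_2 = T \in \mathcal{L}(L^\infty(0,b), C[0,b])$ is compact.

Further, it follows from $K_1 \in W^{m,\nu}(\Delta_b)$ that

$$ \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)^{j} L_1(t,s) = \int_s^t \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial \tau}\right)^{j} K_1(t,\tau)\,d\tau, \quad j = 0, 1, \dots, m, $$

and hence $L_1 \in W^{m,\nu-1}(\Delta_b) \subset W^{m,\nu}(\Delta_b)$. Since $L_1, K_2 \in W^{m,\nu}(\Delta_b)$,
$T_1, T_2 \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$ are compact (see [7] for details). Since
$1 \in W^{m,\nu}(\Delta_b)$, we also deduce that $T_{02} : C^{m,\nu}[0,b] \to C^{m,\nu}[0,b]$ is compact.
If $y_1, y_2 \in C^{m,\nu}[0,b]$, $m \in \mathbb{N}$, $\nu < 1$, then, by (4), $y_1 y_2 \in C^{m,\nu}[0,b]$
and $\|y_1 y_2\|_{m,\nu} \le c\,\|y_1\|_{m,\nu}\|y_2\|_{m,\nu}$, with a constant $c$ which is independent
of $y_1$ and $y_2$. This implies $T_{01} \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$. In summary,
$T_{01}T_{02} + T_1 + T_2 = T \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$ is compact. Lemma 1 is
proved.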



The regularity of the solution of equation (1) is described in the following

Theorem 1. Let $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p, q \in C^{m,\nu}[0,b]$, $m \in \mathbb{N}$, $\nu \in \mathbb{R}$,
$\nu < 1$. Then equation (7) has a unique solution $z \in C^{m,\nu}[0,b]$, implying that
problem {(1),(2)} has a unique solution $y \in C^{m+1,\nu-1}[0,b]$ for every $y_0 \in \mathbb{R}$.

Proof. It follows from $p, q \in C^{m,\nu}[0,b]$ and $K_1 \in W^{m,\nu}(\Delta_b)$ that
$f \in C^{m,\nu}[0,b]$. Indeed, $f = f_1 + f_2$, where (see (8)) $f_1(t) = q(t) + y_0 p(t)$, $t \in [0,b]$,
and $f_2(t) = y_0 \int_0^t K_1(t,s)\,ds$, $t \in [0,b]$. Clearly, $f_1 \in C^{m,\nu}[0,b]$, and $f_2 = y_0 T 1$,
with $T$ defined as in (10) but with the kernel $K_1$ in place of $K$. Since $1 \in C^{m,\nu}[0,b]$
and such an operator is bounded from $C^{m,\nu}[0,b]$ to $C^{m,\nu}[0,b]$ (see Lemma 1 and its
proof), $f_2 \in C^{m,\nu}[0,b]$. As the homogeneous equation $z = Tz$ has only the trivial
solution $z = 0$, it follows from $f \in C^{m,\nu}[0,b]$ and Lemma 1 that $I - T$ has a bounded
inverse $(I - T)^{-1} \in \mathcal{L}(C^{m,\nu}[0,b], C^{m,\nu}[0,b])$, and equation $(I - T)z = f$ has a unique
solution $z = (I - T)^{-1} f \in C^{m,\nu}[0,b]$. In other words, $y' \in C^{m,\nu}[0,b]$, implying
$y \in C^{m+1,\nu-1}[0,b]$. Theorem 1 is proved.



3 Piecewise polynomial interpolation 

For given $N \in \mathbb{N}$, $r \in \mathbb{R}$, $r \ge 1$, let
$\Pi_N^r = \{t_0, \dots, t_N : 0 = t_0 < \dots < t_N = b\}$
be a partition (a grid) of the interval $[0,b]$ given by the grid points

$$ t_j = b\,(j/N)^r, \quad j = 0, \dots, N. \tag{11} $$

Here $r$ (also called the grading exponent) characterizes the non-uniformity of
the grid $\Pi_N^r$: if $r > 1$ then the grid points (11) are more densely clustered near
the left endpoint of the interval $[0,b]$. Let

$$ \sigma_j = [t_{j-1}, t_j], \quad h_j = t_j - t_{j-1}, \quad j = 1, \dots, N. \tag{12} $$

For given integers $m \ge 0$ and $-1 \le d \le m-1$, let $S_m^{(d)}(\Pi_N^r)$ be the spline
space of piecewise polynomial functions on the grid $\Pi_N^r$:

$$ S_m^{(d)}(\Pi_N^r) = \{u : u|_{\sigma_j} \in \pi_m,\ j = 1, \dots, N;\ u \in C^d[0,b] \text{ if } d \ge 0\}, $$

where $\pi_m$ denotes the set of polynomials of degree not exceeding $m$ and $u|_{\sigma_j}$
is the restriction of $u$ to the subinterval $\sigma_j$, $j = 1, \dots, N$. Note that elements
of $S_m^{(-1)}(\Pi_N^r) = \{u : u|_{\sigma_j} \in \pi_m,\ j = 1, \dots, N\}$ may have jump discontinuities
at the interior points $t_1, \dots, t_{N-1}$ of the grid $\Pi_N^r$.
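The graded grid (11) and the step sizes (12) are simple to generate. The following minimal Python sketch illustrates them; the function names `graded_grid` and `step_sizes` are ours and do not come from the paper.

```python
def graded_grid(N, r, b):
    """Grid points t_j = b * (j / N)**r, j = 0, ..., N, as in (11)."""
    if N < 1 or r < 1 or b <= 0:
        raise ValueError("expected N >= 1, r >= 1 and b > 0")
    return [b * (j / N) ** r for j in range(N + 1)]


def step_sizes(t):
    """Step sizes h_j = t_j - t_{j-1}, j = 1, ..., N, as in (12)."""
    return [t[j] - t[j - 1] for j in range(1, len(t))]


# With r = 2 the grid points cluster near the left endpoint t = 0.
t = graded_grid(N=4, r=2.0, b=1.0)   # [0.0, 0.0625, 0.25, 0.5625, 1.0]
h = step_sizes(t)                    # [0.0625, 0.1875, 0.3125, 0.4375]
```

For r = 1 the same call returns the uniform grid with step b/N.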

In every subinterval $\sigma_j$ ($j = 1, \dots, N$) we introduce $m \in \mathbb{N}$ interpolation
points

$$ t_{jl} = t_{j-1} + \eta_l h_j, \quad l = 1, \dots, m \quad (j = 1, \dots, N), \tag{13} $$

where $\eta_1, \dots, \eta_m$ do not depend on $j$ and $N$ and satisfy

$$ 0 \le \eta_1 < \dots < \eta_m \le 1. \tag{14} $$

To a given continuous function $z : [0,b] \to \mathbb{R}$ we assign a piecewise polynomial
interpolation function $P_N z \in S_{m-1}^{(-1)}(\Pi_N^r)$ which interpolates $z$ at the
points (13): $(P_N z)(t_{jl}) = z(t_{jl})$, $l = 1, \dots, m$; $j = 1, \dots, N$. Thus, $P_N z$ is
independently defined in every subinterval $\sigma_j$ ($j = 1, \dots, N$) and $(P_N z)(t)$
may be discontinuous at $t = t_j$, $j = 1, \dots, N-1$; we may treat $P_N z$ as a
two-valued function at these points. Note that in case $\eta_1 = 0$, $\eta_m = 1$ (see (14)),
$P_N z \in C[0,b]$. We also introduce an interpolation operator $P_N$ which assigns
to every function $z \in C[0,b]$ its piecewise polynomial interpolation function
$P_N z$.



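Lemma 2. [7, 8] Let $z \in C^{m,\nu}[0,b]$, $m \in \mathbb{N}$, $\nu \in \mathbb{R}$, $\nu < 1$. Then

$$ \sup_{t \in [0,b]} |z(t) - (P_N z)(t)| \le c\,\varepsilon_N^{(m,\nu,r)}, $$

where $c$ is a constant not depending on $N$ and

$$ \varepsilon_N^{(m,\nu,r)} = \begin{cases} N^{-m} & \text{for } m < 1 - \nu,\ r \ge 1; \\ N^{-m}(1 + \log N) & \text{for } m = 1 - \nu,\ r = 1; \\ N^{-m} & \text{for } m = 1 - \nu,\ r > 1; \\ N^{-r(1-\nu)} & \text{for } m > 1 - \nu,\ 1 \le r < m/(1-\nu); \\ N^{-m} & \text{for } m > 1 - \nu,\ r \ge m/(1-\nu). \end{cases} \tag{15} $$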
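To make the case distinction in (15) concrete, the following Python sketch evaluates the bound factor for given N, m, ν and r. The function name `eps_N` is ours, the constant c is omitted, and the exact floating-point comparison `m == 1 - nu` is a simplification that is only reliable when 1 - ν is exactly representable.

```python
import math


def eps_N(N, m, nu, r):
    """Bound factor from (15), without the constant c (illustrative sketch)."""
    if N < 1 or m < 1 or nu >= 1 or r < 1:
        raise ValueError("expected N >= 1, m >= 1, nu < 1 and r >= 1")
    N = float(N)
    if m < 1 - nu:
        return N ** (-m)
    if m == 1 - nu:  # borderline case: a logarithm appears on the uniform grid
        return N ** (-m) * (1 + math.log(N)) if r == 1 else N ** (-m)
    # m > 1 - nu: the grading exponent r decides the attainable order
    if r < m / (1 - nu):
        return N ** (-r * (1 - nu))
    return N ** (-m)


# For m = 2, nu = 0.5 the full order N**(-2) requires r >= 4.
uniform = eps_N(10, 2, 0.5, 1.0)   # 10**(-0.5)
graded = eps_N(10, 2, 0.5, 4.0)    # 10**(-2)
```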



Lemma 3. Let $T : L^\infty(0,b) \to C[0,b]$ be a linear compact operator. Then
$\|T - P_N T\|_{\mathcal{L}(L^\infty(0,b), L^\infty(0,b))} \to 0$ as $N \to \infty$.

Proof. An easy observation shows that

$$ \|P_N\|_{\mathcal{L}(C[0,b], L^\infty(0,b))} \le c, \quad N \in \mathbb{N}, \tag{16} $$

where $c$ is a constant not depending on $N$. On the basis of (16) and Lemma 2 we
obtain that $\|z - P_N z\|_{L^\infty(0,b)} \to 0$ as $N \to \infty$ for every $z \in C[0,b]$. Together
with the compactness of $T : L^\infty(0,b) \to C[0,b]$ this yields the assertion of
the lemma.



4 Collocation method 



We look for an approximation $v$ to the solution $z$ of equation (6) in the space
$S_{m-1}^{(-1)}(\Pi_N^r)$, determining $v \in S_{m-1}^{(-1)}(\Pi_N^r)$, $m \ge 1$, from the following
conditions:

$$ v(t_{jl}) = f(t_{jl}) + \int_0^{t_{jl}} K_1(t_{jl},s) \int_0^s v(\tau)\,d\tau\,ds + \int_0^{t_{jl}} [p(t_{jl}) + K_2(t_{jl},s)]\,v(s)\,ds, \quad l = 1, \dots, m;\ j = 1, \dots, N, \tag{17} $$

with $\{t_{jl}\}$ given by (13). Having determined the approximation $v$ for $z$, we can
also determine the approximation $u$ for $y$, the solution of the initial value problem
{(1),(2)}, setting (see (5))

$$ u(t) = y_0 + \int_0^t v(s)\,ds, \quad t \in [0,b]. \tag{18} $$

Note that the choice of the collocation points (13) with $\eta_1 = 0$, $\eta_m = 1$ in (14)
actually implies that the resulting collocation approximation $v$ belongs to the
smoother polynomial spline space $S_{m-1}^{(0)}(\Pi_N^r)$.
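As an illustration of (17) and (18) in the simplest setting, the sketch below applies piecewise constant collocation (m = 1, one collocation parameter η_1 = 1/2) to a test problem chosen by us, not taken from the paper: p = 1, q = 0, K_1 = K_2 = 0, y_0 = 1, so that y' = y, y(0) = 1 and z = y' = exp(t). In this case (17) reduces to v(t_{j1}) = y_0 + ∫_0^{t_{j1}} v(s) ds, and all integrals of the piecewise constant v are available in closed form. All function names are hypothetical.

```python
import math


def graded_grid(N, r, b):
    """Grid points t_j = b * (j / N)**r as in (11)."""
    return [b * (j / N) ** r for j in range(N + 1)]


def collocate_piecewise_constant(N, r=1.0, b=1.0, eta=0.5, y0=1.0):
    """Solve v(t_j1) = y0 + int_0^{t_j1} v(s) ds for piecewise constant v (m = 1)."""
    t = graded_grid(N, r, b)
    coeffs = []       # coeffs[j-1] is the value of v on sigma_j
    integral = 0.0    # int_0^{t_{j-1}} v(s) ds, accumulated exactly
    for j in range(1, N + 1):
        h = t[j] - t[j - 1]
        # Collocation condition at t_j1 = t_{j-1} + eta*h:
        #   c_j = y0 + integral + c_j * eta * h
        c_j = (y0 + integral) / (1.0 - eta * h)
        coeffs.append(c_j)
        integral += c_j * h
    return t, coeffs


def u_at(t, coeffs, x, y0=1.0):
    """u(x) = y0 + int_0^x v(s) ds, see (18)."""
    total = y0
    for j, c_j in enumerate(coeffs):
        lo, hi = t[j], t[j + 1]
        if x <= lo:
            break
        total += c_j * (min(x, hi) - lo)
    return total


t, c = collocate_piecewise_constant(N=200)
error_at_b = abs(u_at(t, c, 1.0) - math.e)   # error of u at t = b = 1
```

Because the kernel here is smooth, no grading is needed (r = 1). For weakly singular kernels the integrals in (17) are no longer elementary and the choice of r matters, as Theorems 2 and 3 describe.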

Theorem 2. Let $y_0 \in \mathbb{R}$, $K_1, K_2 \in W^{m,\nu}(\Delta_b)$, $p, q \in C^{m,\nu}[0,b]$, $m \in \mathbb{N}$,
$\nu \in \mathbb{R}$, $\nu < 1$, and assume that the collocation points (13) (with the grid
points (11) and parameters (14)) are used.

Then, for all sufficiently large $N$, say $N \ge N_0$, and for every choice of
parameters (14) with $\eta_1 > 0$ or $\eta_m < 1$, the equalities (18) and (17) determine
unique approximations $u \in S_m^{(0)}(\Pi_N^r)$ and $v \in S_{m-1}^{(-1)}(\Pi_N^r)$ (with
$v|_{\sigma_j} = (u|_{\sigma_j})'$, $j = 1, \dots, N$) to the solution $y$ of problem {(1),(2)} and its
derivative $y'$, respectively. If $\eta_1 = 0$, $\eta_m = 1$, then $u \in S_m^{(1)}(\Pi_N^r)$ and
$v = u' \in S_{m-1}^{(0)}(\Pi_N^r)$. For all $N \ge N_0$ the following error estimate holds:

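$$ \|u^{(i)} - y^{(i)}\|_\infty \le c\,\varepsilon_N^{(m,\nu,r)}, \quad i \in \{0, 1\}. \tag{19} $$

Here $c$ is a constant not depending on $N$, $\varepsilon_N^{(m,\nu,r)}$ is given by (15) and

$$ \|u^{(i)} - y^{(i)}\|_\infty = \max_{j=1,\dots,N}\ \sup_{t_{j-1} < t < t_j} |u^{(i)}(t) - y^{(i)}(t)|, \quad i \in \{0, 1\}. \tag{20} $$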

Proof. As we know from Section 2, problem {(1),(2)} is equivalent to the
integral equation (7) where $z = y'$ and the forcing function $f$ and the kernel $K$
are given by (8) and (9), respectively. We rewrite (7) in the form $z = Tz + f$,
with $T$ defined by (10). We find that $f \in C^{m,\nu}[0,b] \subset L^\infty(0,b)$. It follows
from Lemma 1 that $T$ is compact as an operator from $L^\infty(0,b)$ to $L^\infty(0,b)$.
Therefore, $z = Tz + f$ has a unique solution $z \in L^\infty(0,b)$. Moreover, on the
basis of Theorem 1 we obtain that $z \in C^{m,\nu}[0,b]$.

Further, conditions (17) are equivalent to the operator equation
$v = P_N T v + P_N f$, with $P_N$ defined in Section 3. From Lemma 3 and
from the boundedness of $(I - T)^{-1}$ in $L^\infty(0,b)$ we obtain that $I - P_N T$ is
invertible in $L^\infty(0,b)$ for sufficiently large $N$, say $N \ge N_0$. Moreover, the norms
of $(I - P_N T)^{-1}$ are uniformly bounded in $N$:

$$ \|(I - P_N T)^{-1}\|_{\mathcal{L}(L^\infty(0,b), L^\infty(0,b))} \le c, \quad N \ge N_0. \tag{21} $$

Thus, for $N \ge N_0$, equation $v = P_N T v + P_N f$ has a unique solution
$v \in S_{m-1}^{(-1)}(\Pi_N^r)$ ($v \in S_{m-1}^{(0)}(\Pi_N^r)$ if $\eta_1 = 0$, $\eta_m = 1$). For $v$ and $z$, the solutions
of equations $v = P_N T v + P_N f$ and $z = Tz + f$ respectively, we have

$$ v - z = (I - P_N T)^{-1}(P_N z - z), \quad N \ge N_0. \tag{22} $$

Now (21) yields $\|v - z\|_{L^\infty(0,b)} \le c\,\|P_N z - z\|_{L^\infty(0,b)}$, $N \ge N_0$, with a constant
$c$ which is independent of $N$. Applying Lemma 2 we obtain the estimate (19)
with $i = 1$.

Further, due to (18) and (2), $y(t) - u(t) = \int_0^t [y'(s) - v(s)]\,ds$, $t \in [0,b]$.
Applying (19) with $i = 1$ we obtain the estimate (19) with $i = 0$. Theorem 2
is proved.

Thus, according to Theorem 2, in case $m > 1 - \nu$ the approximation order
$\|y - u\|_\infty \le cN^{-m}$ is guaranteed for $r \ge m/(1-\nu)$. For $\nu$ close to 1, $\nu < 1$, this
condition on $r$ may be too restrictive. To obtain the order $\|y - u\|_\infty \le cN^{-m}$,
the condition on $r$ can be considerably relaxed, as shown in the following



Theorem 3. yo G R, G K 2 G p,q e 

$m \in \mathbb{N}$, $\nu \in \mathbb{R}$, $\nu < 1$, $m > 1 - \nu$, and assume that the collocation
points (13) with the grid points (11) and parameters (14) are used. Then, with
the notation of Theorem 2, we have the following estimates for the error $y - u$:

1) if $1 - \nu < m < 2 - \nu$ then $\|y - u\|_\infty \le cN^{-m}$ for $r \ge 1$;

2) if $m = 2 - \nu$ then

$$ \|y - u\|_\infty \le c \begin{cases} N^{-m}(1 + \log N) & \text{for } r = 1; \\ N^{-m} & \text{for } r > 1; \end{cases} $$

3) if $m > 2 - \nu$ then

$$ \|y - u\|_\infty \le c \begin{cases} N^{-r(2-\nu)} & \text{for } 1 \le r < m/(2-\nu), \\ N^{-m}(1 + \log N) & \text{for } r = m/(2-\nu), \\ N^{-m} & \text{for } r > m/(2-\nu). \end{cases} $$
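The practical difference between Theorems 2 and 3 is the grading exponent needed for the full order N^{-m} of the error y - u. The following Python sketch (the function name `full_order_grading` is ours) returns both thresholds for given m and ν with m > 1 - ν, read off from (15) and from cases 1)-3) above.

```python
def full_order_grading(m, nu):
    """Grading thresholds for the order N**(-m) of y - u when m > 1 - nu.

    Returns (r2, r3, strict3): Theorem 2 requires r >= r2; Theorem 3 requires
    r > r3 if strict3 is True and r >= r3 otherwise.
    """
    if not (nu < 1 and m > 1 - nu):
        raise ValueError("expected nu < 1 and m > 1 - nu")
    r2 = m / (1 - nu)
    if m < 2 - nu:
        return r2, 1.0, False        # case 1): every r >= 1 suffices
    return r2, m / (2 - nu), True    # cases 2), 3): r > m / (2 - nu)


# For m = 3 and nu = 0.9, Theorem 2 needs r >= 30, Theorem 3 only r > 3/1.1.
r2, r3, strict3 = full_order_grading(3, 0.9)
```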



Proof. Using the equality $(I - P_N T)^{-1} = I + (I - P_N T)^{-1} P_N T$, we rewrite
the error (22) in the form $v - z = P_N z - z + (I - P_N T)^{-1} P_N T (P_N z - z)$,
$N \ge N_0$. Due to continuity and boundedness of $K(t,s)$ on $\overline{\Delta}_b$, $T$ is bounded
as an operator from $L^1(0,b)$ to $C[0,b]$ (see (9) and (10)). Together with (5),
(16) and (21) we obtain that

$$ |y(t) - u(t)| = \left| \int_0^t [z(s) - v(s)]\,ds \right| \le c \int_0^b |(P_N z)(s) - z(s)|\,ds, $$

where $0 \le t \le b$. Since $z \in C^{m,\nu}[0,b]$, $m > 1 - \nu$, then (see [12], p. 116, [7, 8])

$$ \max_{t_{j-1} \le t \le t_j} |(P_N z)(t) - z(t)| \le c\,(t_j - t_{j-1})^m\, t_j^{1-\nu-m}, \quad j = 1, \dots, N, $$

with $\{t_j\}$ given by (11). It follows from (11) that

$$ t_j - t_{j-1} \le c\,N^{-r} j^{r-1}, \quad j = 1, \dots, N. $$

Therefore, for $t \in [0,b]$, we have

$$ |y(t) - u(t)| \le c \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} |(P_N z)(s) - z(s)|\,ds \le c_1 N^{-r(2-\nu)} \sum_{j=1}^{N} j^{r(2-\nu)-m-1}, \tag{23} $$

with $c$ and $c_1$ not depending on $N$. Furthermore, for a number $a \in \mathbb{R}$ we have

$$ \sum_{j=1}^{N} j^{a} \le c \begin{cases} N^{a+1} & \text{if } a > -1, \\ 1 + |\log N| & \text{if } a = -1, \\ 1 & \text{if } a < -1, \end{cases} \tag{24} $$

where $c$ is a constant which does not depend on $N$. Applying (24) with
$a = r(2-\nu) - m - 1$ to (23) it is easy to see that the statements of Theorem 3
hold.

Remarks. 1) The equalities (17) form a system of algebraic equations
whose exact form is determined by the choice of a basis in $S_{m-1}^{(-1)}(\Pi_N^r)$ (or in
$S_{m-1}^{(0)}(\Pi_N^r)$ if $\eta_1 = 0$, $\eta_m = 1$). For instance, in each subinterval $[t_{j-1}, t_j]$
($j = 1, \dots, N$) we may use the representation
$v(t_{j-1} + \tau h_j) = \sum_{l=1}^{m} c_{jl}\,\varphi_l(\tau)$, $\tau \in [0,1]$, where $\varphi_l$ denotes the $l$th
Lagrange fundamental polynomial of degree $m - 1$ associated with the
parameters (14), that is
$\varphi_l(\tau) = \prod_{i=1,\, i \ne l}^{m} (\tau - \eta_i)/(\eta_l - \eta_i)$, $\tau \in [0,1]$. The conditions (17)
then lead to a linear system of equations for the coefficients
$c_{jl} = v(t_{jl})$, $l = 1, \dots, m$; $j = 1, \dots, N$.

2) Method {(17), (18)}, where we have discretized the integral equation (6),
is equivalent to the collocation method applied directly to problem {(1),(2)}.
In the latter form the collocation method in a more particular case ($K_2 = 0$,
$K_1 \in W^{m,\nu}(\Delta_b)$, $0 < \nu < 1$, $p, q \in C^m[0,b]$, $m \in \mathbb{N}$) has been examined in
[3, 4, 5, 6].
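The Lagrange representation mentioned in Remark 1 can be sketched in a few lines of Python. The names `lagrange_basis` and `evaluate_on_subinterval` are ours; the code only evaluates the local representation of v on one subinterval and does not assemble the linear system arising from (17).

```python
def lagrange_basis(eta, l, tau):
    """l-th (1-based) Lagrange fundamental polynomial for the nodes eta at tau."""
    value = 1.0
    for i, eta_i in enumerate(eta, start=1):
        if i != l:
            value *= (tau - eta_i) / (eta[l - 1] - eta_i)
    return value


def evaluate_on_subinterval(coeffs, eta, tau):
    """v(t_{j-1} + tau*h_j) = sum_l c_jl * phi_l(tau), with c_jl = v(t_jl)."""
    return sum(c * lagrange_basis(eta, l, tau)
               for l, c in enumerate(coeffs, start=1))


# m = 3 parameters satisfying (14); the coefficients sample tau**2 at the nodes,
# so the local representation reproduces tau**2 for every tau in [0, 1].
eta = [0.0, 0.5, 1.0]
coeffs = [e ** 2 for e in eta]
value = evaluate_on_subinterval(coeffs, eta, 0.3)
```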

3) The convergence results established by Theorems 2 and 3 are derived
under the assumption that all needed integrals in (17) can be evaluated
analytically. Since this is rarely possible in concrete applications, there arises the
question how to approximate these integrals so that the resulting fully
discretized collocation method converges under the same conditions and with the
same rate as is proved for the "exact" collocation method in Theorems 2
and 3. This question will be discussed elsewhere.

Summary. In this paper, we show that the discontinuous finite element method
recently developed by Warsa, Wareing and Morel for radiation-diffusion problems
belongs to a class of generalized local discontinuous Galerkin methods. We then
derive a priori error bounds for this method and numerically confirm them to be
sharp.



1 Introduction 

In the recent work [7] and [6], Warsa, Wareing and Morel introduced a
discontinuous finite element method for the discretization of a radiation-diffusion
problem that is represented by a system of two coupled first order equations
for the zeroth and first angular moments of the particle distribution (the scalar
flux and current, respectively). These so-called $P_1$ equations arise out of an
angular Galerkin approximation to the Boltzmann transport equation based
upon a spherical-harmonic trial space of first order; see [3] or [4] for more
details. In the absence of time dependence (the case considered here) the
resulting problem finds the current $J = J(x)$ and the scalar flux $\Phi = \Phi(x)$
satisfying

$$ \nabla\Phi + 3\sigma_t(x)\,J = 3Q_1, \qquad \nabla \cdot J + \sigma_a(x)\,\Phi = Q_0, \qquad \text{in } \Omega \subset \mathbb{R}^d, \tag{1} $$

subject to so-called vacuum and reflecting boundary conditions, respectively,

$$ \tfrac{1}{2}\Phi - J \cdot n = 0 \ \text{ on } \Gamma_V, \qquad J \cdot n = 0 \ \text{ on } \Gamma_R. \tag{2} $$

Here, $\Omega$ is a bounded polygonal ($d = 2$) or polyhedral ($d = 3$) domain with
outward normal unit vector $n$ on the boundary $\Gamma = \partial\Omega$, which is partitioned into
two parts $\Gamma = \Gamma_V \cup \Gamma_R$ with disjoint interiors. The right-hand sides $Q_0 \in L^2(\Omega)$
and $Q_1 \in L^2(\Omega)^d$ are the zeroth and first angular moments of an inhomogeneous
source. We assume that the material coefficients $\sigma_t$ and $\sigma_a$ belong to
$L^\infty(\Omega)$, that $\sigma_t$ is bounded below by a positive constant, and that $\sigma_a(x) \ge 0$ in $\Omega$
($\sigma_a = 0$ in purely scattering subregions). For simplicity, we further assume that
$\int_{\Gamma_V} ds > 0$.

The method of Warsa, Wareing and Morel approximates both the unknowns
$J$ and $\Phi$ by piecewise linear functions. It is designed in such a way that
the radiation energy as well as the radiation momentum are conserved over
each cell, as in standard, upwind Godunov schemes. In combination with
efficient preconditioning techniques, the results in [5-7] indicate that the method
of Warsa, Wareing and Morel can be applied to a wide range of problems.

In this note, we show that the discrete formulation of the $P_1$ equations
of Warsa, Wareing and Morel belongs to the general class of mixed
discontinuous Galerkin (DG) methods analyzed by Castillo, Cockburn, Perugia and
Schötzau in [1]. This class extends and generalizes the local discontinuous
Galerkin (LDG) method proposed by Cockburn and Shu [2]. In the original
LDG approach the vector unknown can be eliminated from the equations in
a local and element-wise manner. In contrast, the method of Warsa, Wareing
and Morel belongs to the "truly" mixed variants of the LDG method described
in [1] for which such a local elimination is no longer possible. While this can be
seen as a shortcoming of truly mixed DG methods, it in fact leads to better and
nearly optimal convergence rates for the approximation of the vector variable;
see [1]. Furthermore, as described in [6], the $P_1$ equations are used to
accelerate the iterative convergence of the Boltzmann transport equation solution.
The transport equation is discretized with a spatial DG method and in many
cases effective acceleration strongly depends on how well the vector unknown
of the acceleration equations is approximated. This application is what makes
the use of the truly mixed LDG discretization for the $P_1$ equations necessary
and motivates the study presented in this paper.

We apply the theoretical results in [1] and conclude that the method of
Warsa, Wareing and Morel is well-posed. Moreover, the following a priori error
bounds hold: for an approximation order $k \ge 0$, the method exhibits
convergence rates in the mesh size of order $k + \frac{1}{2}$ in a suitable energy norm, and
of order $k + 1$ in the $L^2$-norm of the scalar flux. We present a set of numerical
convergence tests for a three dimensional model problem on tetrahedral
meshes that verify the theoretical predictions. We must point out that these
tests complete the tests in [1] where no numerical results were shown for truly
mixed DG methods.



2 Discontinuous Galerkin discretization 

In this section, we detail the mixed discontinuous Galerkin discretization
proposed by Warsa, Wareing and Morel [6, 7] and cast the method in the setting
of [1].

We consider shape regular meshes $\mathcal{T}_h$ of mesh size $h$ that partition the
domain $\Omega$ into triangles and/or parallelograms. We allow for irregular nodes,
in general, but assume that the local mesh sizes are of bounded variation. Using
the same notation as in [1], we let $\mathcal{E}_I$ be the union of all interior faces of $\mathcal{T}_h$,
$\mathcal{E}_V$ and $\mathcal{E}_R$ the union of all boundary faces of $\mathcal{T}_h$ on $\Gamma_V$ and $\Gamma_R$, respectively,
and set $\mathcal{E} = \mathcal{E}_I \cup \mathcal{E}_V \cup \mathcal{E}_R$. For piecewise smooth vector- and scalar-valued
functions $w$ and $u$ we introduce the following trace operators. Let $e \subset \mathcal{E}_I$ be
an interior face shared by two elements $K^+$ and $K^-$, and write $n^\pm$ for the
outward normal unit vectors to the boundaries $\partial K^\pm$, respectively. Denoting
by $w^\pm$ and $u^\pm$ the traces on $\partial K^\pm$ taken from $K^\pm$, respectively, we define the
jumps across $e$ by $[\![w]\!] = w^+ \cdot n^+ + w^- \cdot n^-$ and $[\![u]\!] = u^+ n^+ + u^- n^-$, and the
averages $\{\!\{w\}\!\} = (w^+ + w^-)/2$ and $\{\!\{u\}\!\} = (u^+ + u^-)/2$. On a boundary face
$e \subset \mathcal{E}_V \cup \mathcal{E}_R$, we set $[\![w]\!] = w \cdot n$, $[\![u]\!] = un$, $\{\!\{w\}\!\} = w$ and $\{\!\{u\}\!\} = u$.

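Let $\mathcal{T}_h$ be a triangulation of $\Omega$ and $k \ge 0$ an approximation order. We
wish to approximate $(J, \Phi)$ by a piecewise polynomial function $(J_h, \Phi_h)$; that
is, $(J_h|_K, \Phi_h|_K) \in \mathcal{P}_k(K)^d \times \mathcal{P}_k(K)$ for all $K \in \mathcal{T}_h$. Here, $\mathcal{P}_k(K)$ denotes the
set of polynomials of degree at most $k$ on $K$. This approximation is defined
by imposing that, for all elements $K \in \mathcal{T}_h$ and all test functions
$(w, u) \in \mathcal{P}_k(K)^d \times \mathcal{P}_k(K)$,

$$ \begin{aligned} 3\int_K \sigma_t\, J_h \cdot w\,dx - \int_K \Phi_h\, \nabla \cdot w\,dx + \int_{\partial K} \widehat{\Phi}_h\, w \cdot n_K\,ds &= 3\int_K Q_1 \cdot w\,dx, \\ -\int_K J_h \cdot \nabla u\,dx + \int_{\partial K} u\, \widehat{J}_{h,K}\,ds + \int_K \sigma_a\, \Phi_h\, u\,dx &= \int_K Q_0\, u\,dx. \end{aligned} \tag{3} $$

Here, $\widehat{J}_{h,K}$ and $\widehat{\Phi}_h$ are the so-called numerical fluxes, which are approximations
to the traces of $J \cdot n_K$ and $\Phi$ on the element interfaces and are chosen as follows
(see [6]).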

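First, for an element $K^+$ and an interior face $e$ shared by $K^+$ and a
neighboring element $K^-$, we define the following inwardly and outwardly directed
discrete flows (partial currents) by

$$ J^{\mathrm{in}}_{e,K^+} = \tfrac{1}{4}\Phi_h^- - \tfrac{1}{2}\, J_h^- \cdot n_{K^+}, \qquad J^{\mathrm{out}}_{e,K^+} = \tfrac{1}{4}\Phi_h^+ + \tfrac{1}{2}\, J_h^+ \cdot n_{K^+}. $$

If the face $e$ of $K^+$ is contained in $\mathcal{E}_V \cup \mathcal{E}_R$, the outwardly directed flow
is defined as $J^{\mathrm{out}}_{e,K^+} = \tfrac{1}{4}\Phi_h + \tfrac{1}{2}\, J_h \cdot n_{K^+}$. Further, if the face $e$ of $K^+$ belongs
to $\mathcal{E}_V$, we set $J^{\mathrm{in}}_{e,K^+} = 0$, whereas $J^{\mathrm{in}}_{e,K^+}$ is not needed for $e \subset \mathcal{E}_R$.

The numerical fluxes are then taken as

$$ \widehat{J}_{h,K}|_e = (1 - \xi)\left(J^{\mathrm{out}}_{e,K} - J^{\mathrm{in}}_{e,K}\right), \qquad \widehat{\Phi}_h|_e = 2\left[(1 + \xi)\, J^{\mathrm{out}}_{e,K} + (1 - \xi)\, J^{\mathrm{in}}_{e,K}\right], \tag{4} $$

with $\xi = 0$ if $e \subset \mathcal{E}_I \cup \mathcal{E}_V$, and $\xi = 1$ if $e \subset \mathcal{E}_R$.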

This completes the definition of the DG method proposed in [6, 7] for
problem (1)-(2) (where only the case $k = 1$ was considered). Notice that, for $e$
shared by $K^+$ and $K^-$, $\widehat{J}_{h,K^+}|_e = -\widehat{J}_{h,K^-}|_e$, whereas the definition of $\widehat{\Phi}_h|_e$ does
not depend on which side of $e$ it is taken from. This is the reason for the
subscript $K$ in $\widehat{J}_{h,K}$. The choice in (4) of the numerical fluxes can be motivated
by physical arguments corresponding to the so-called Marshak approximation;
obviously the choice is not unique, but consistency with the intended application
in which the solution of the $P_1$ equations is embedded requires that we make
this choice; see [6] for more details. Note that the fluxes can further be easily
adapted to take into account inhomogeneous boundary data.

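We now cast the fluxes in (4) in the setting of [1]. Denote by $\widehat{J}_h$ a vector
field such that $\widehat{J}_h|_e \cdot n_K = \widehat{J}_{h,K}|_e$ for all $e \subset \partial K$ (the definition of $\widehat{J}_h|_e$ no
longer depends on which side it is taken from). Then we have

$$ \widehat{J}_h|_e = \{\!\{J_h\}\!\} + \tfrac{1}{4}[\![\Phi_h]\!], \qquad \widehat{\Phi}_h|_e = \{\!\{\Phi_h\}\!\} + [\![J_h]\!], \qquad \text{if } e \subset \mathcal{E}_I, $$

$$ \widehat{J}_h|_e = \tfrac{1}{2} J_h + \tfrac{1}{4}\Phi_h n, \qquad \widehat{\Phi}_h|_e = \tfrac{1}{2}\Phi_h + J_h \cdot n, \qquad \text{if } e \subset \mathcal{E}_V, $$

$$ \widehat{J}_h|_e = 0, \qquad \widehat{\Phi}_h|_e = \Phi_h + 2\, J_h \cdot n, \qquad \text{if } e \subset \mathcal{E}_R. $$

The form of these fluxes shows that the method in (3) belongs to the class of
mixed DG methods investigated in [1] and the theoretical results there can be
used to analyze it. In particular the formulation (3) is consistent and uniquely
solvable; see [1, Proposition 2.1].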
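The equivalence between the partial-current form (4) and the jump/average form of the interior fluxes can be checked numerically. The Python sketch below does this for arbitrary traces on one interior face, assuming the definitions of the partial currents, jumps and averages given in this section; the function names are ours.

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def fluxes_from_partial_currents(phi_p, phi_m, J_p, J_m, n_p):
    """Interior-face fluxes from (4) with xi = 0, seen from K^+ (unit normal n_p)."""
    j_in = 0.25 * phi_m - 0.5 * dot(J_m, n_p)
    j_out = 0.25 * phi_p + 0.5 * dot(J_p, n_p)
    return j_out - j_in, 2.0 * (j_out + j_in)   # (J_hat_{h,K+}, Phi_hat_h)


def fluxes_jump_average_form(phi_p, phi_m, J_p, J_m, n_p):
    """Same fluxes written as {{J}} + [[Phi]]/4 and {{Phi}} + [[J]]."""
    n_m = tuple(-x for x in n_p)
    avg_J = tuple(0.5 * (a + b) for a, b in zip(J_p, J_m))
    jump_phi = tuple(phi_p * a + phi_m * b for a, b in zip(n_p, n_m))
    jump_J = dot(J_p, n_p) + dot(J_m, n_m)
    J_hat = tuple(a + 0.25 * b for a, b in zip(avg_J, jump_phi))
    return dot(J_hat, n_p), 0.5 * (phi_p + phi_m) + jump_J


# Example traces on a face with unit normal n^+ = (1, 0, 0).
a = fluxes_from_partial_currents(2.0, 1.0, (1.0, 0.5, 0.0), (0.2, 0.1, 0.3), (1.0, 0.0, 0.0))
b = fluxes_jump_average_form(2.0, 1.0, (1.0, 0.5, 0.0), (0.2, 0.1, 0.3), (1.0, 0.0, 0.0))
```

Both calls return the same pair up to rounding, which is the identity used to place the method in the framework of [1].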



3 Error analysis 



In this section, we discuss the a priori error bounds that are obtained for
the formulation (3). For a piecewise smooth function $(w, u)$, we define the
seminorm

$$ |(w,u)|_h^2 = 3\,\|\sigma_t^{1/2} w\|_{0,\Omega}^2 + \|[\![w]\!]\|_{0,\mathcal{E}_I \cup \mathcal{E}_V}^2 + 2\,\|[\![w]\!]\|_{0,\mathcal{E}_R}^2 + \|\sigma_a^{1/2} u\|_{0,\Omega}^2 + \tfrac{1}{4}\,\|[\![u]\!]\|_{0,\mathcal{E}_I \cup \mathcal{E}_V}^2. $$

Whenever $\sigma_a > 0$, $|(\cdot,\cdot)|_h$ actually defines a norm. Our main result is given
in Theorem 1, which derives directly from the analysis in [1, Theorem 2.2].
Strictly speaking, that analysis was carried out for $\sigma_a = 0$ and simpler
boundary conditions but extension to the current situation poses no difficulties.

Theorem 1. Assume the exact solution $(J, \Phi)$ of (1)-(2) to belong to
$H^{s+1}(\Omega)^d \times H^{s+2}(\Omega)$ with $s \ge 0$. Let $(J_h, \Phi_h)$ be the DG approximation
obtained by (3), for an approximation degree $k \ge 0$. Then we have the error
bound

$$ |(J - J_h, \Phi - \Phi_h)|_h \le C h^{\min\{s,k\}+\frac{1}{2}} \left( \|J\|_{s+1,\Omega} + \|\Phi\|_{s+2,\Omega} \right), $$

with a constant $C > 0$ independent of the mesh size $h$.

Furthermore, if the domain and the coefficients $\sigma_t$ and $\sigma_a$ are sufficiently
regular, we also have the $L^2$-bound

$$ \|\Phi - \Phi_h\|_{0,\Omega} \le C h^{\min\{s,k\}+1} \left( \|J\|_{s+1,\Omega} + \|\Phi\|_{s+2,\Omega} \right), $$

with a constant $C > 0$ independent of the mesh size $h$.

The order of convergence in the approximation of the seminorm $|(\cdot,\cdot)|_h$ is
half a power of $h$ better than for the standard LDG method, due to the truly
mixed nature of the numerical fluxes; cf. [1]. Moreover, the above result also
holds true for $k = 0$, i.e., for piecewise constant approximations. For the LDG
method, no convergence has been observed either theoretically or numerically
in this case.



4 Numerical results 

In this section, we present the results of a series of numerical experiments that
demonstrate the theoretical error estimates of Theorem 1.

4.1 Smooth solution

We start by testing the performance of the method for a smooth solution.
We consider the radiation-diffusion system (1) on $\Omega = (0,1)^3$, with reflecting
boundary conditions on the faces $\{x = 0\}$ and $\{x = 1\}$, and vacuum boundary
conditions on the remaining boundary faces. The material coefficients are at = 
1 and a a = 10“^. The right-hand sides Qo and Qi are chosen so that the 
exact solution $(J, \Phi)$ is a polynomial of order 7 and thus arbitrarily smooth.
The corresponding numerical solutions for $k = 0$ and $k = 1$ are computed on
a sequence of tetrahedral meshes $\{\mathcal{T}_i\}_{i=1}^{5}$ constructed by uniformly dividing
the domain into Cartesian grids with $2^i$, $i = 1, \dots, 5$, equal intervals in each
dimension. Each cube in the grid is then subdivided into six tetrahedra of
equal volume. The mesh size $h_i$ on mesh $\mathcal{T}_i$ is therefore proportional to $2^{-i}$.
In Table 1 we report the errors and the numerical convergence rates $r_i$
in the $|\cdot|_h$-seminorm, in the $L^2$-norm of $\Phi - \Phi_h$, as well as in the $L^2$-norm of
$J - J_h$. Clearly, we obtain convergence order $k + \frac{1}{2}$ in the $|\cdot|_h$-seminorm, as well
as order $k + 1$ in the $L^2$-norm for $\Phi$. This confirms the theoretical results in
Theorem 1. Also, the $\|J - J_h\|_{0,\Omega}$ part of the $|\cdot|_h$-seminorm actually converges
more rapidly than the whole $|\cdot|_h$-seminorm, namely with the optimal rate
$k + 1$. The convergence behavior of the different jump contributions to the
$|\cdot|_h$-seminorm is shown in Figure 1. Here, we plot the different errors against
$n = 2^i$, which is proportional to $h^{-1}$. We use the abbreviations I, R and V
for $\mathcal{E}_I$, $\mathcal{E}_R$ and $\mathcal{E}_V$, respectively. Comparison with the line $y = n^{-k-1/2}$ clearly
shows that the jumps over interior faces converge with order $k + \frac{1}{2}$. On the
other hand, comparison with the line $y = n^{-k-1}$ shows that all the boundary
jumps exhibit a better convergence of the order $k + 1$. This indicates that
the errors in the interior jump contributions are the ones that render the first
estimate in Theorem 1 sharp. Finally, we note the similar behavior is observed 
for solutions that are piecewise in iJ^-regular; for brevity, these numerics have 
been omitted.

Table 1. Smooth solution: errors and convergence rates.

              |(J-J_h, Φ-Φ_h)|_h       ||Φ-Φ_h||_{0,Ω}          ||J-J_h||_{0,Ω}
         i    error      r_i           error      r_i           error      r_i
k = 0    1    7.11e-2    -             1.45e-2    -             3.13e-2    -
         2    4.91e-2    5.34e-1       8.90e-3    7.00e-1       1.90e-2    7.18e-1
         3    3.21e-2    6.14e-1       4.71e-3    9.17e-1       1.02e-2    9.04e-1
         4    2.14e-2    5.82e-1       2.40e-3    9.73e-1       5.21e-3    9.67e-1
         5    1.47e-2    5.47e-1       1.21e-3    9.90e-1       2.63e-3    9.87e-1
k = 1    1    4.40e-2    -             7.22e-3    -             1.21e-2    -
         2    1.91e-2    1.20e+0       2.36e-3    1.61e+0       3.71e-3    1.71e+0
         3    7.41e-3    1.36e+0       6.33e-4    1.90e+0       9.78e-4    1.92e+0
         4    2.74e-3    1.43e+0       1.61e-4    1.97e+0       2.47e-4    1.99e+0
         5    9.91e-4    1.47e+0       4.06e-5    1.99e+0       6.16e-5    2.00e+0

4.2 A two-material problem

In practice the coefficient at may have strong discontinuities at the interfaces 
between different materials. In these cases, where the vector- valued quantities 

and J are typically smoother than the scalar flux the use of a truly 
mixed method, such as the one considered in this paper, is of particular
importance. In this series of numerical experiments, we consider a problem with
material discontinuities. The radiation-diffusion equations (1) are solved with
$\sigma_a = 0$ and a discontinuous coefficient $\sigma_t$. We again set $\Omega = (0,1)^3$, and specify
vacuum boundary conditions on the faces $\{x = 0\}$ and $\{x = 1\}$, with reflecting
boundary conditions on the remaining boundary faces. We define $\sigma_t = a$ for
$0 < x < 0.5$ and $\sigma_t = b$ for $0.5 < x < 1$, with positive parameters $a$ and $b$,
modeling a material discontinuity at $x = 0.5$. The right-hand sides $Q_1$ and $Q_0$ are
chosen so that the solution (J,^) is given as follows. Denoting by (p{y,z) the 

polynomial (p{y, z) — y'^z'^^ we take J(x,y, z) = ^|(a-hx(6 — a)) ‘(f{y^ z), 0, 0^ 
and 

& ') = / ™ (o> 0-5) ^ (o> 1) (o> i)> 

2/, - I ^ ^ 

Notice that ^ is piecewise smooth, but only belongs to £ > 0, 

whereas = {^ip{y,z),{x - \)dyip{y , z) , {x - \)dzy>{y,z)^ is a smooth 

function. We use the same sequence of tetrahedral meshes as the numerical 
experiments presented in Section 4.1. 

We consider the two cases $a = 1$, $b = 0.01$ and $a = 100$, $b = 1$. Notice
that the jump in the normal derivative of $\Phi$ at the surface $x = 0.5$ is equal to
3(a — b)y‘^z‘^. This jump is almost two orders of magnitude larger in the case 
$a = 100$, $b = 1$ than in the case $a = 1$, $b = 0.01$. For this reason deterioration
of the convergence rates due to the lack of smoothness of the exact solution
might be more serious in the second case than in the first one.




On a Discontinuous Galerkin Method for Radiation-Diffusion Problems 693 




Fig. 1. Smooth solution: $L^2$-errors of the jumps in $J$ and $\Phi$



Table 2 shows the errors and the numerical convergence rates $r_i$ in the
$|\cdot|_h$-seminorm and in the $L^2$-norm of $\Phi - \Phi_h$, as well as in the $L^2$-norm of
$J - J_h$, for the case $a = 1$, $b = 0.01$. Order $k + \frac{1}{2}$ in the $|\cdot|_h$-seminorm, and
order $k + 1$ convergence in the $L^2$-norm for $\Phi$ are obtained. These results agree
with the theoretical estimates in Theorem 1, even though the elliptic regularity
assumptions of Theorem 1 for the estimate of the $L^2$-error in $\Phi$ are not satisfied.
Furthermore, the $L^2$-error in $J$ also converges with the optimal order $k + 1$. In
Figure 2, the errors in the jumps of $\Phi$ and $J$ are shown. Again, we see that the
interior jumps converge at the rates $O(h^{k+1/2})$ while the boundary jumps show
the full convergence rate of $O(h^{k+1})$.

Table 2. Two-material problem, a = 1, b = 0.01: errors and convergence rates.

              |(J-J_h, Φ-Φ_h)|_h       ||Φ-Φ_h||_{0,Ω}          ||J-J_h||_{0,Ω}
         i    error      r_i           error      r_i           error      r_i
k = 0    1    1.76e-1    -             6.15e-2    -             5.14e-2    -
         2    1.27e-1    4.67e-1       3.18e-2    9.50e-1       2.92e-2    8.17e-1
         3    8.95e-2    5.04e-1       1.61e-2    9.85e-1       1.54e-2    9.21e-1
         4    6.30e-2    5.08e-1       8.07e-3    9.95e-1       7.91e-3    9.61e-1
         5    4.44e-2    5.05e-1       4.04e-3    9.98e-1       4.01e-3    9.80e-1
k = 1    1    9.42e-2    -             1.59e-2    -             1.39e-2    -
         2    3.66e-2    1.36e+0       4.06e-3    1.97e+0       3.56e-3    1.97e+0
         3    1.33e-2    1.46e+0       1.02e-3    1.99e+0       8.73e-4    2.03e+0
         4    4.77e-3    1.49e+0       2.55e-4    2.00e+0       2.13e-4    2.04e+0
         5    1.69e-3    1.49e+0       6.38e-5    2.00e+0       5.26e-5    2.02e+0


Now consider the case $a = 100$, $b = 1$. The numerical rates in Table 3
indeed paint a different picture than the previous case shown in Table 2 for
$a = 1$ and $b = 0.01$. For both $k = 0$ and $k = 1$ the rates in the $|\cdot|_h$-seminorm
have deteriorated slightly, but they are still close to $k + \frac{1}{2}$. Similar behavior is
observed in the $L^2$-norm of $J$. These results are in agreement with Theorem 1.
The differences in the $L^2$-errors in $\Phi$ are more remarkable: the convergence
rates are smaller than 0.5 for $k = 0$ and better than 2 for $k = 1$. However,
the oscillations in the numbers might indicate that the asymptotic regime has
not been reached yet. Finally, we show in Figures 3 the behavior of the errors
in the jumps of $\Phi$ and $J$ for $k = 0$ and $k = 1$, respectively. For $k = 0$ the
jumps of $\Phi$ on the reflective boundary in this case converge more slowly, and
suboptimally, than in the case $a = 1$, $b = 0.01$, while the other jumps exhibit
rates of convergence similar to those for the case $a = 1$, $b = 0.01$. For $k = 1$
the jumps of $\Phi$ on the reflective boundary exhibit convergence similar to those
for $k = 0$. However, convergence is at the optimal rate of $O(h^{3/2})$, that is, half
an order less than the $L^2$-norm. All the other jumps show convergence rates
as for the $k = 0$ case.



5 Conclusions 

The discontinuous Galerkin method of Warsa, Wareing and Morel introduced 
in [7] and [6] is a (truly) mixed discontinuous Galerkin method that belongs 
to the general class of methods analyzed in [1]. The analysis there ensures 
well-posedness and a priori error estimates, which have been verified in a se- 
ries of numerical experiments. It is important for the specific applications in 
which solutions of the Pi equations are needed that convergence has been 
demonstrated for the vector unknown, even in the case A: = 0. 
Fig. 2. Two-material problem, a = 1, b = 0.01: $L^2$-errors of the jumps in J and $\phi$ (panel: k = 0)
        i    error      r_i        error      r_i        error      r_i
k = 0   1    1.47 e+1   -          7.76 e+0              3.83 e+0
        2    1.05 e+1   4.84 e-1   5.05 e+0   6.21 e-1   2.01 e+0   9.29 e-1
        3    7.49 e+0   4.90 e-1   3.49 e+0   5.34 e-1   1.04 e+0   9.56 e-1
        4    5.32 e+0   4.95 e-1
        5    3.79 e+0   4.87 e-1
k = 1   1    5.36 e+0   -          2.68 e+0   -          8.23 e-1
        2    2.09 e+0
Summary. Multi-phase flows are frequently modeled in engineering fluid mechanics.
In this work incompressible two-phase flows are considered. The present research
aims to model high density-ratio flows with complex interface topologies, typically
air/water flows. Applications are mixtures of bubbles and droplets. Aspects which
are taken into account are: a sharp front (density changes rapidly), arbitrarily shaped
interfaces, surface tension, buoyancy and coalescence of drops/bubbles. Attention is
paid to mass conservation and integrity of the interface.

A survey of available computational methods is performed in [1]. The computational
method used in this paper is the Mass Conserving Level-Set method (MCLS,
[2]). The MCLS method is based on the Level-Set methodology, using a VOF function
to conserve mass. This function is advected without the necessity to reconstruct the
interface. The ease of MCLS is based on an explicit relationship between the
Volume-of-Fluid function and the Level-Set function. The method is straightforward to apply
to arbitrarily shaped interfaces, which may collide and break up.



1 Introduction 

Various methods have been put forward to treat multi-phase flows. A
classification is given in [1]. The two methods that are most suitable for the current
research are the Volume-of-Fluid (VOF) method and the Level-Set method.
For both methods a marker function is used to define the interface. In the case of
the Volume-of-Fluid method, a marker function, say $\Psi$, indicates the fractional
volume of a certain fluid, say fluid ‘1’, in a computational cell. It can be seen
as the concentration of the marker particles of the MAC method when the
number of particles goes to infinity.

An alternative to the Volume-of-Fluid method is the Level-Set method
([3, 4]). The interface is now defined by the zero level-set of a marker function,
say $\Phi$: $\Phi = 0$ at the interface, $\Phi > 0$ inside fluid ‘1’ and $\Phi < 0$ elsewhere.
The function $\Phi$ is chosen such that it is smooth near the interface. This eases
the computation of interface derivatives. Also, methods available from hyperbolic
conservation laws can be used to advect the interface. The interface is
(implicitly) advected by advecting $\Phi$ as if it were a material constant:

$$\frac{\partial \Phi}{\partial t} + \mathbf{u} \cdot \nabla \Phi = 0. \qquad (1)$$
The Level-Set method has some advantages over the Volume-of-Fluid
method, especially where solving the flow field is concerned, since interface
normals, curvature and distance towards the interface can be expressed easily
in terms of $\Phi$ or derivatives of $\Phi$. Also, advecting the interface is possible
by the application of ‘off-the-shelf’ techniques available from hyperbolic
conservation laws. For these reasons, the Level-Set method has been chosen as
the basis of our work. However, mass conservation is not an intrinsic property
and is considered the major drawback of the Level-Set method. Our work
focuses on a mass-conserving way to advect the interface by means of the
Mass-Conserving Level-Set method (MCLS, [2]).

The MCLS method has a shared foundation with the CLSVOF method ([5,
6]) and to a lesser extent with the combined Level-Set/particle method ([7]) in
the sense that it is based on Level-Set and additional effort is made to conserve
mass. The difference with CLSVOF is that here there is no combination of
two existing methods. The method takes full advantage of all additional
information provided by the Level-Set function rather than coupling
Level-Set with Volume-of-Fluid/PLIC. In fact we use the Volume-of-Fluid function
as a help variable to conserve mass, without applying the difficult convection
(namely interface reconstruction) which makes the VOF so elaborate. The key
issue of our method is that we define a simple relationship between the
Level-Set function $\Phi$ and the Volume-of-Fluid function $\Psi$. This relation is obtained by
assuming piecewise linear interfaces within a computational cell:

$$\Psi = f(\Phi, \nabla \Phi). \qquad (2)$$

It makes the advection of the Volume-of-Fluid function $\Psi$ easy (i.e. without
interface reconstruction) and finding $\Phi$ from $\Psi$ a straightforward task. This is
carried out by well-known numerical tools, like Picard and Newton iterations.
The PLIC method is not adopted (unlike CLSVOF), yet mass is conserved
in the same manner. Note that the CLSVOF method might not be easily
extendible to 3D space. Yet the extension of MCLS to three-dimensional space
can be done in a straightforward way. Note also that with this approach, it is 
not necessary to smooth (or regularize) which is common for other methods. 



2 Governing Equations 

Consider two fluids ‘0’ and ‘1’ in a domain $\Omega \subset \mathbb{R}^2$ which are separated by an
interface S. Both fluids are assumed to be incompressible, i.e.:

$$\nabla \cdot \mathbf{u} = 0, \qquad (3)$$

where $\mathbf{u} = (u, v)^t$ is the velocity vector and u and v are the velocities in
x- and y-direction respectively. The flow is governed by the incompressible
Navier-Stokes equations:

$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = -\frac{1}{\rho} \nabla p + \frac{1}{\rho} \nabla \cdot \mu \left( \nabla \mathbf{u} + \nabla \mathbf{u}^t \right) + \mathbf{g}, \qquad (4)$$

where $\rho$, p, $\mu$ and $\mathbf{g}$ are the density, pressure, viscosity and gravity vector
respectively. The density and viscosity are constant within each fluid. Using
the Level-Set function $\Phi$ these can be expressed as

$$\mu = \mu_0 + (\mu_1 - \mu_0) H(\Phi) \qquad (5)$$

and similarly for $\rho$, where the subscript indicates the corresponding fluid and H
is the Heaviside step function.



2.1 Interface conditions 



The interface conditions express continuity of mass and momentum at the
interface:

$$\left[ p\,\mathbf{n} - \mathbf{n} \cdot \mu \left( \nabla \mathbf{u} + \nabla \mathbf{u}^t \right) \right] = \sigma \kappa\, \mathbf{n}, \qquad (6)$$

where the brackets denote jumps across the interface, $\mathbf{n}$ is a normal vector
at the interface, $\sigma$ is the surface tension coefficient and $\kappa$ is the curvature
of the interface. Clearly, the velocity $\mathbf{u}$ is continuous at the interface. If the
viscosity $\mu$ is continuous at the interface, it can be shown that the derivatives
of the velocity components are continuous too ([8, 9]). In that case Eqn. (6)
reduces to $[\mathbf{u}] = 0$ and $[p] = \sigma \kappa$. To achieve that, the viscosity is forced to be
continuous by smoothing Expression (5):

$$\mu = \mu_0 + (\mu_1 - \mu_0) H_\alpha(\Phi), \qquad (7)$$

where $H_\alpha$ is the smoothed (or regularized) Heaviside step function

$$H_\alpha(x) = \begin{cases} 0, & x < -\alpha, \\ \frac{1}{2} \left( 1 + \sin\left( \frac{x}{2\alpha} \pi \right) \right), & |x| \le \alpha, \\ 1, & x > \alpha, \end{cases} \qquad (8)$$

and $\alpha$ is a parameter proportional to the mesh width. Here $\alpha$ is chosen as
(following [10]) $\alpha = \frac{3}{2} h$, where h is the mesh width. According to [11], the
viscosity is then smoothed over three mesh widths, provided $|\nabla \Phi| = 1$. Note
that only the viscosity is smoothed, not the density $\rho$. Note also that when
the density is not regularized, mass is conserved when the volume of a certain
fluid or phase is conserved. In fact, the MCLS method conserves volumes
by construction. Due to the non-regularized density field, mass is conserved
too. Instead of taking into account the pressure jump at the interface due
to the surface tension forces, the continuous surface force/stress (CSF, [12])
methodology is adopted.
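The regularization in Eqns. (7) and (8) can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the text (smoothing half-width of 3h/2, level-set value given at a point); the function names `smoothed_heaviside` and `smoothed_viscosity` are hypothetical and not taken from the paper.

```python
import math


def smoothed_heaviside(x, alpha):
    """Regularized Heaviside step of half-width alpha, following Eqn. (8)."""
    if x < -alpha:
        return 0.0
    if x > alpha:
        return 1.0
    # Smooth sine transition across the band |x| <= alpha.
    return 0.5 * (1.0 + math.sin(math.pi * x / (2.0 * alpha)))


def smoothed_viscosity(phi, mu0, mu1, h):
    """Viscosity of Eqn. (7) with alpha = 3h/2 (three mesh widths in total)."""
    alpha = 1.5 * h
    return mu0 + (mu1 - mu0) * smoothed_heaviside(phi, alpha)
```

Only the viscosity would be passed through this function; the density keeps the sharp Heaviside step of Eqn. (5), as stated above.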
3 Computational Approach

The Navier-Stokes equations are solved on a Cartesian grid in a rectangular
domain by the pressure-correction method ([13]). The unknowns are stored in
a Marker-and-Cell (staggered) layout ([14]). For the interface representation
the Level-Set methodology is adopted. The interface conditions are satisfied by
means of the continuous surface force (CSF) methodology. The discontinuous
density field is dealt with similarly to the Ghost Fluid method for incompressible
flow ([8]). Further information about the flow-field computations can be
found in [2].

3.1 Interface advection 

The strategy of modeling two-phase flows is to compute the flow with a given 
interface position and subsequently evolve the interface in the given flow field. 
In the foregoing, it has been described how the flow is computed with a given 
interface position. Next we consider the evolution of the interface. 

Level-Set The interface is implicitly defined by a Level-Set function $\Phi$. More
precisely, the interface, say S, is the zero level-set of $\Phi$:

$$S(t) = \{ \mathbf{x} \in \mathbb{R}^2 \mid \Phi(\mathbf{x}, t) = 0 \}. \qquad (9)$$

The interface is evolved by advecting the Level-Set function in the flow field
as if it were a material constant (Eqn. (1)):

$$\frac{\partial \Phi}{\partial t} + \mathbf{u} \cdot \nabla \Phi = 0. \qquad (10)$$

A homogeneous Neumann boundary condition for $\Phi$ is imposed at the
boundaries. It will be clear that the accuracy of the approximation of Eqn. (10)
determines the accuracy of the interface representation. The accuracy will also
determine the mass errors. For this purpose, the discretization of the
gradient of $\Phi$ can be either first-order upwind, or second- or third-order ENO
([10, 11, 15]). In the case of the first-order spatial discretization, a forward Euler
temporal discretization is sufficient. In the case of the higher-order spatial
discretization, a Runge-Kutta scheme is applied (e.g. [16]).
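As an illustration of the simplest variant mentioned above (first-order upwind in space, forward Euler in time, homogeneous Neumann boundaries), one time step of Eqn. (10) might look as follows. This is a minimal NumPy sketch on a collocated grid with a prescribed velocity; the function name `advect_levelset_upwind` and the array layout (axis 0 is x, axis 1 is y) are assumptions made here, and the staggered layout of the paper is not reproduced.

```python
import numpy as np


def advect_levelset_upwind(phi, u, v, dx, dy, dt):
    """One forward Euler step of phi_t + u phi_x + v phi_y = 0 (first-order upwind)."""
    # Ghost cells copy the boundary values: homogeneous Neumann condition.
    p = np.pad(phi, 1, mode="edge")
    c = p[1:-1, 1:-1]
    back_x = (c - p[:-2, 1:-1]) / dx   # backward difference in x
    forw_x = (p[2:, 1:-1] - c) / dx    # forward difference in x
    back_y = (c - p[1:-1, :-2]) / dy   # backward difference in y
    forw_y = (p[1:-1, 2:] - c) / dy    # forward difference in y
    # Upwinding: take the difference from the side the flow comes from.
    phi_x = np.where(u > 0, back_x, forw_x)
    phi_y = np.where(v > 0, back_y, forw_y)
    return phi - dt * (u * phi_x + v * phi_y)
```

The velocity arguments may be scalars or arrays of the same shape as `phi`. For stability the time step has to respect the usual CFL restriction of explicit upwind schemes.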

MCLS The difficulty with the Level-Set method is that, although $\Phi$ might
be conserved, this does not imply that mass is conserved. On the other hand,
with the Volume-of-Fluid method, mass is conserved when $\Psi$ is conserved. In
order to conserve mass with the Level-Set method, corrections to the
Level-Set function are made by considering the fractional volume of a certain fluid
within a computational cell. First the usual Level-Set advection is performed:
first-order advection and unmodified re-initialization. Low-order advection and
re-initialization will ensure numerical smoothness of $\Phi$. Furthermore, when the
flow field is computed, higher-order accuracy might not be expected when the
CSF method is applied and viscosity is regularized. In that respect, higher-order
discretization of Eqn. (10) will only lead to improved mass conservation
for the pure Level-Set methods. Since the Level-Set function obtained in this way
will certainly not conserve mass, corrections to it are made such that
mass is conserved. This requires three steps:

1. the relative volume of a certain fluid in a computational cell (called
‘volume-of-fluid’ function $\Psi$) is to be computed from the Level-Set function
$\Phi^n$: $\Psi^n = f(\Phi^n, \nabla \Phi^n)$;

2. the volume-of-fluid function has to be advected conservatively during
a time step towards $\Psi^{n+1}$;

3. with this new volume-of-fluid function $\Psi^{n+1}$, corrections to the advected
Level-Set function are sought such that $f(\Phi^{n+1}, \nabla \Phi^{n+1}) = \Psi^{n+1}$ holds.

These three steps will be explained subsequently.

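Step 1: Volume-of-Fluid function A relationship between the Level-Set function
$\Phi$ and the so-called volume-of-fluid function $\Psi$ is found by considering
the fractional volume of a certain fluid in a computational cell $\Omega_k$. In
this paper, a Cartesian mesh is employed consisting of computational cells
$\Omega_k$, $k = 1, 2, \ldots$. By $\mathbf{x}_k = (x_k, y_k)^t$ the center node of $\Omega_k$ is meant and $\Delta x$
and $\Delta y$ are the mesh sizes in x and y direction respectively. The volume-of-fluid
function $\Psi_k$ is defined in terms of the Level-Set function $\Phi$ by

$$\Psi_k = \frac{1}{\Delta x\, \Delta y} \int_{\Omega_k} H(\Phi)\, d\Omega, \qquad (11)$$

where H is the Heaviside step function. The Level-Set function $\Phi$ is linearized
around $\mathbf{x}_k$, which leads to

$$\Psi_k = f(\Phi_k, \nabla \Phi_k). \qquad (12)$$

Note that in contrast with other approaches, the Heaviside step function is
not regularized. After some mathematical manipulations, the function f is
evaluated as

$$f(\Phi_k, \nabla \Phi_k) = \begin{cases}
0, & \Phi_k \le -\Phi_{\max,k}, \\
\dfrac{1}{2} \dfrac{(\Phi_{\max,k} + \Phi_k)^2}{\Phi_{\max,k}^2 - \Phi_{\mathrm{mid},k}^2}, & -\Phi_{\max,k} < \Phi_k < -\Phi_{\mathrm{mid},k}, \\
\dfrac{1}{2} + \dfrac{\Phi_k}{\Phi_{\max,k} + \Phi_{\mathrm{mid},k}}, & -\Phi_{\mathrm{mid},k} \le \Phi_k \le \Phi_{\mathrm{mid},k}, \\
1 - \dfrac{1}{2} \dfrac{(\Phi_{\max,k} - \Phi_k)^2}{\Phi_{\max,k}^2 - \Phi_{\mathrm{mid},k}^2}, & \Phi_{\mathrm{mid},k} < \Phi_k < \Phi_{\max,k}, \\
1, & \Phi_k \ge \Phi_{\max,k},
\end{cases} \qquad (13)$$

where

$$\Phi_{\max,k} = \frac{1}{2} \left( |D_{x,k}| + |D_{y,k}| \right), \quad
\Phi_{\mathrm{mid},k} = \frac{1}{2} \Big|\, |D_{x,k}| - |D_{y,k}| \,\Big|, \quad
D_{x,k} = \Delta x \left. \frac{\partial \Phi}{\partial x} \right|_k, \quad
D_{y,k} = \Delta y \left. \frac{\partial \Phi}{\partial y} \right|_k, \qquad (14)$$

which are approximated by central differencing.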
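To make the piecewise definition concrete, the following NumPy sketch evaluates the function f of Eqn. (13) with the quantities of Eqn. (14). It is an illustration written for this text, not the authors' code: the name `vof_from_levelset` is hypothetical, the gradient is passed in by the caller, and the small constant `tiny` is an assumption added here to avoid division by zero in branches that are discarded anyway.

```python
import numpy as np


def vof_from_levelset(phi, dphidx, dphidy, dx, dy):
    """Fractional volume Psi = f(Phi, grad Phi) for a linearized interface."""
    phi = np.asarray(phi, dtype=float)
    Dx = dx * np.abs(dphidx)
    Dy = dy * np.abs(dphidy)
    phi_max = 0.5 * (Dx + Dy)
    phi_mid = 0.5 * np.abs(Dx - Dy)
    tiny = 1e-30
    # phi_max^2 - phi_mid^2 equals Dx*Dy, phi_max + phi_mid equals max(Dx, Dy).
    quad = np.maximum(Dx * Dy, tiny)
    lin = np.maximum(phi_max + phi_mid, tiny)
    lower = 0.5 * (phi_max + phi) ** 2 / quad
    middle = 0.5 + phi / lin
    upper = 1.0 - 0.5 * (phi_max - phi) ** 2 / quad
    psi = np.where(phi < -phi_mid, lower, np.where(phi <= phi_mid, middle, upper))
    psi = np.where(phi <= -phi_max, 0.0, psi)   # cell entirely outside fluid '1'
    psi = np.where(phi >= phi_max, 1.0, psi)    # cell entirely inside fluid '1'
    return psi
```

A quick sanity check is a signed-distance circle on a uniform grid: summing the returned fractions times the cell area should approximate the area of the circle, with `np.gradient` supplying central differences in the interior.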



Step 2: Volume-of-Fluid advection At a certain time instant the volume-of-fluid
function $\Psi$ can be computed from $\Phi$ by means of Eqn. (12). The volume-of-fluid
function after a time step is found by considering the flux of fluid F that flows 
through a boundary T of a computational cell during time-step At: 



. , = (Z/” , . , — 

^+ 2 ». 2+2 ^+ 2’-/+ 2 Ax Ay 






(15) 



The fluxes are again computed by linearizing $\Phi$ (just like Eqn. (13)). In fact,
the fluxes are computed by the straightforward application of f.

It is possible that fluid is fluxed more than once through different faces,
which would cause unphysical values of $\Psi$. As reported in e.g. [1], this can be
solved by employing either a multidimensional scheme or flux-splitting. For
reasons of simplicity we have chosen the second approach. The order of
fluxing is: first in x-direction, then in y-direction. Currently the flux-splitting
of [5] is adopted. As reported in [5], undershoots and/or overshoots can still
occur, which leads to unphysical values of $\Psi$, namely < 0 and > 1. If these
values are replaced by 0 and 1 respectively, mass errors arise which are of
order 10“^. This is also experienced in the current research. Mass errors are 
completely avoided by redistributing $\Psi$ ([2]).



Step 3: Inverse function Having found a new Volume-of-Fluid function $\Psi^{n+1}$,
the initial guess of the Level-Set function (after Level-Set advection) is
modified, such that mass is conserved within each computational cell. In other
words, find $\Phi^{n+1} = (\Phi_1^{n+1}, \Phi_2^{n+1}, \ldots)$ such that

$$\left| f(\Phi_k^{n+1}, \nabla \Phi_k^{n+1}) - \Psi_k^{n+1} \right| \le \epsilon \quad \forall k = 1, 2, \ldots, \qquad (16)$$

where $\epsilon$ is some tolerance. It will be clear that due to the behavior of f no
unique solution $\Phi$ exists. However, a (small) correction to the initial guess is
sought, where the initial guess comes from Level-Set advection. A solution $\Phi$ is
found by the following iteration (until convergence): leave $\Phi$ unmodified in a
grid point when the Volume-of-Fluid constraint is satisfied and make corrections
locally when this constraint is not satisfied. This is achieved by using the
inverse function g of f as given in Eqn. (13) with respect to argument $\Phi_k$:






( 17 ) 
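The local correction of Step 3 can be illustrated for a single cell with a frozen gradient. The sketch below inverts the piecewise function of Eqn. (13) branch by branch; this closed-form inverse is derived here from Eqn. (13) and is not necessarily the form of g used by the authors. The names `vof_f`, `levelset_from_vof` and `correct_cell` are hypothetical, and `phi_max`, `phi_mid` are the quantities of Eqn. (14).

```python
import math


def vof_f(phi, phi_max, phi_mid):
    """Scalar version of Eqn. (13) for given phi_max >= phi_mid >= 0."""
    if phi <= -phi_max:
        return 0.0
    if phi >= phi_max:
        return 1.0
    if phi < -phi_mid:
        return 0.5 * (phi_max + phi) ** 2 / (phi_max ** 2 - phi_mid ** 2)
    if phi <= phi_mid:
        return 0.5 + phi / (phi_max + phi_mid)
    return 1.0 - 0.5 * (phi_max - phi) ** 2 / (phi_max ** 2 - phi_mid ** 2)


def levelset_from_vof(psi, phi_max, phi_mid):
    """Inverse of vof_f with respect to phi (gradient frozen)."""
    if phi_max <= 0.0:
        raise ValueError("zero gradient: f is a step function, no inverse")
    psi = min(max(psi, 0.0), 1.0)
    # Value of f at phi = -phi_mid, where the quadratic and linear parts meet.
    psi_lo = 0.5 * (phi_max - phi_mid) / (phi_max + phi_mid)
    if psi < psi_lo:
        return -phi_max + math.sqrt(2.0 * psi * (phi_max ** 2 - phi_mid ** 2))
    if psi <= 1.0 - psi_lo:
        return (psi - 0.5) * (phi_max + phi_mid)
    return phi_max - math.sqrt(2.0 * (1.0 - psi) * (phi_max ** 2 - phi_mid ** 2))


def correct_cell(phi, psi_target, phi_max, phi_mid, eps):
    """Keep phi if the VOF constraint holds within eps, otherwise correct it."""
    if abs(vof_f(phi, phi_max, phi_mid) - psi_target) <= eps:
        return phi
    return levelset_from_vof(psi_target, phi_max, phi_mid)
```

For a target fraction of exactly 0 or 1 the inverse is not unique; the sketch returns the value closest to the interface, which is one possible choice. In a full method the gradient changes when neighbouring cells are corrected, so such a cell-wise update would be repeated until the constraint holds everywhere.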
4 Applications



The behavior of the MCLS approach is shown by a couple of standard advec- 
tion tests, with a prescribed velocity field. Thereafter, the method is applied to 
the complete set of equations by considering a falling drop and a rising bubble 
respectively. 

4.1 Advection tests 

Linear advection The first advection test is a circle, which is advected by a
uniform velocity field. The velocity field is prescribed by $(u, v) = (0, -1)$. The
dimensions of the computational domain are $L_x = 10$ and $L_y = 100$, which
is discretized by a 10 × 100 mesh. Initially a circle of radius $R_0$ is placed at
$x = L_x/2$ and $y = L_y - 2R_0$. For the case of $R_0 = 4$ (a circle with a diameter of
8 mesh sizes), the relative mass is plotted in Fig. 1 as function of the traversed
distance of the circle. First-order, second-order and third-order pure Level-Set




Fig. 1. Relative mass for the linear advection test; e = 10 ^ (every 10*^ iteration 
marked) 



simulations (with and without re-initialization) are compared with the MCLS 
method. ENO discretization is adopted for the pure Level-Set method (see 
aforementioned references). The order of re-initialization is in agreement with 
the order of advection. The tolerance in the VOF advection is taken to be: 
e = 10“^. Globally speaking it can be said that mass is always lost for the 
pure Level-Set advection. Mass losses are smaller for higher accuracy and re- 
initialization causes much higher mass losses. The MCLS method conserves 
mass up to the specified tolerance. 
4.2 Air/water flow

In [8] a two-dimensional rising air bubble in water is considered. The dimen- 
sions and sizes are: Lx = 0.02 m, Ly — l^Lx^ Ro — — Vo — \Lx- The 

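gravity and material constants are: $g = 9.8\ \mathrm{m/s^2}$, $\sigma = 0.0728\ \mathrm{kg/s^2}$, $\rho_w = 10^3\ \mathrm{kg/m^3}$,
$\rho_a = 1.226\ \mathrm{kg/m^3}$, $\mu_w = 1.137 \cdot 10^{-3}\ \mathrm{kg/(m\,s)}$ and $\mu_a = 1.78 \cdot 10^{-5}\ \mathrm{kg/(m\,s)}$, where subscripts
w and a indicate water and air respectively.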

Results are shown in Fig. 2(a) for three different mesh sizes. We take e = 
10~^. Relative mass losses are of the same order and in agreement with the 
advection tests. Note that the number of grid cells is much smaller than in 
[8]. The results are the same for t < 0.025 for all mesh sizes. Thereafter small 
differences occur. The results compare well with [8]. The MCLS method seems 
to result in a more coherent structure at the highly curved part of the interface 
at t = 0.05. This is thought to be caused by the low resolution of the grids 
used here. 
Fig. 2. Air/water flows: (a) rising bubble, (b) falling droplet; 30 × 45, 40 × 60 and 60 × 90 mesh

In Fig. 2(b) results are shown for a falling droplet. The conditions are
the same as for the rising bubble, except for the sign of $\Phi$ at t = 0 and
$y_0 = L_x$. Mass conservation properties are the same as before. The results are
the same until the droplet hits the bottom. Thereafter differences occur. This
is thought to be due to the limited number of grid cells available to capture the
flow phenomena near the wall. The results compare well with [8]. Note that
the results in [8] span t < 0.05; no results after collision are presented.

5 Conclusion 

The Mass Conserving Level-Set (MCLS) method has been presented. The method is
based on the Level-Set methodology, where mass is conserved by considering
the fractional volume of a certain fluid within a computational cell. Advection
tests were used to compare the method with the Level-Set method. Mass is
conserved up to a specified (vanishing) tolerance. The MCLS method combines
the attractiveness of the Level-Set method with the mass-conserving properties
of the Volume-of-Fluid methods, without adopting the latter. This makes
the implementation much easier than for a Volume-of-Fluid (based) method,
especially in three-dimensional space. The applicability of the MCLS method
was illustrated by the application to air/water flows. It is possible to capture
bubbles or droplets within a limited number of grid cells without mass losses
up to the prescribed tolerance. This is an important feature, since future work
will concern three-dimensional problems/geometries, where the number of grid
cells available to an individual entity will decrease considerably.
Summary. This paper deals with the numerical solution of laminar viscous
incompressible steady and unsteady flows through a vessel with a bypass. These
problems are described by the Navier-Stokes equations; the steady solution of
the unsteady system is found by a multistage Runge-Kutta method together with
the time-dependent artificial compressibility method. The unsteady solution is
obtained from an initial steady solution by prescribing unsteady outlet conditions.
Some numerical results for cardiovascular problems are presented: steady
and unsteady 2D flows in a vessel and a bypass.



1 Mathematical model 

In the cardiovascular system one finds many different types of vessels, such as
large arteries, vessels of medium size and capillaries. They differ in diameter
and in the thickness and composition of the wall. In larger vessels the blood flow
can be assumed to behave as an incompressible continuum. This type of flow
can be described by the system of momentum and continuity equations written
in conservation form:

$$\rho \frac{D\mathbf{w}}{Dt} - \nabla \cdot \tau = \rho\, \mathbf{f}, \qquad (1)$$

$$\nabla \cdot \mathbf{w} = 0, \qquad (2)$$

where $\tau$ is the stress tensor of the fluid, $\mathbf{w}$ is the velocity vector and $\mathbf{f}$ is the vector of
external forces, which is later not taken into account. The density of the fluid $\rho$
is supposed to be constant in physiological conditions, although it depends on
the red cell concentration. The functional dependence of the stress tensor $\tau$
on the velocity vector $\mathbf{w}$ and the blood pressure p is described by the following
relations:

$$\tau_{ij} = -p\, \delta_{ij} + \sigma_{ij}, \qquad (3)$$

$$\sigma_{ij} = \mu \left( \frac{\partial w_i}{\partial x_j} + \frac{\partial w_j}{\partial x_i} \right). \qquad (4)$$

Equations (3) and (4) describe a Newtonian fluid. An important feature of blood
flow is pulsatility caused by the periodic motion of the heart. It is also known
[2] that there is scarcely any turbulence in vessels except in some special cases.
The walls of a tube which is the model of a vessel are supposed to be rigid
and the velocity vector $\mathbf{w}$ is zero on them. Blood flow can be assumed to be
laminar [2]. Indeed, in physiological conditions, the values of speed involved
are low enough. Moreover, generally, the periodicity of the flow, together with
the short length of vascular districts, does not give rise to fully developed
turbulence. The Reynolds number $Re = w^* d / \nu$ is an important feature of the flow behaviour.
The quantity $w^*$ is a characteristic velocity, $\nu = \mu / \rho$ is the kinematic viscosity and d
is a length scale. In large and medium human vessels, the Reynolds number
ranges from 400 up to 10000. In the stationary case the pulsating nature of blood flow
is not considered. Elasticity of vessel tubes is not considered in either case. The
flow can then be described as viscous, incompressible, laminar and stationary
(or unsteady) in 2D by the system of Navier-Stokes equations without the influence
of exterior forces and heat exchange. The system is written in conservative
non-dimensional vector form, rewritten from (1), (2):

$$\tilde{R} W_t + F_x + G_y = \frac{\tilde{R}}{Re} \left( W_{xx} + W_{yy} \right), \qquad (5)$$

where $W = (p, u, v)^T$ is the vector of solution, $\tilde{R} = \mathrm{diag}\|0, 1, 1\|$ and
$F = (u, u^2 + p, uv)^T$, $G = (v, uv, v^2 + p)^T$ denote inviscid fluxes, $(u, v)$ is the velocity
vector and p denotes pressure. $Re = U^* L^* / \eta^*$ is the Reynolds number, where $U^*$ is the speed
of the upstream flow, $L^*$ denotes the width of the channel and $\eta^*$ is a reference kinematic
viscosity (* denotes a reference dimensional quantity). For the upstream boundary
condition we prescribe the velocity vector; along the walls the velocity vector is
equal to zero because of the viscosity of the fluid and the impenetrability of the wall; the downstream
boundary condition is $p = p_2$, which should ensure a pressure gradient.



2 Numerical model 

The solution of system (5) is obtained using the method of artificial compressibility:
the equation of continuity is completed with the term $p_t / a^2$, where $a^2 \in \mathbb{R}^+$.
Rewritten in vector form, the modified system (5) reads

$$W_t + F_x + G_y = \frac{\tilde{R}}{Re} \left( W_{xx} + W_{yy} \right), \qquad (6)$$

where $W = (p / a^2, u, v)^T$. To find the steady solution one can solve the unsteady
system (6) by the finite volume method together with a time-dependent method. The system
of equations (6) can be solved using a three-stage Runge-Kutta method with
given steady boundary conditions. At the inlet, extrapolation of pressure is
used. At the outlet the value of pressure is set constant for the stationary case, and
for the unsteady case the pressure is prescribed by a sine function in the form

$$p_2 = p_{20} \left( 1 + \alpha \sin 2 \pi \omega t \right), \qquad (7)$$

where $\omega$ is a frequency and $\alpha$ is an amplitude. The multistage Runge-Kutta method
is stabilized by an artificial viscosity term (Jameson’s type):
$$W^{(0)}_{i,j} = W^{n}_{i,j}, \qquad (8)$$

$$W^{(r)}_{i,j} = W^{(0)}_{i,j} - \alpha_r \Delta t\, RW^{(r-1)}_{i,j} \quad (r = 1, \ldots, m), \qquad (9)$$

$$W^{n+1}_{i,j} = W^{(m)}_{i,j}, \quad m = 3, \qquad (10)$$

where 

(11) 

and coefficients $\alpha_1 = 0.5$, $\alpha_2 = 0.5$, $\alpha_3 = 1.0$, so the numerical method is
second order in time and space. The form of the steady residual $RW_{i,j}$ depends
on the method used for discretizing the space derivatives:

(12) 

k=l 

where — F,G^ — G and F^ — (0,Ux,Vx)'^ ,G^ = {0,Uy,Vy)'^. Artificial 
viscosity term in this case depends on the second derivatives of pressure 

and is used for improving stability of the solution: 



DWij = - 2Wij + 






E = diag||0,€i,e2||,ei,e2 G R 

7i = max(7ii , 7 * 2 ), 7j = max(7ji , 7 ^ 2 ) 

_ |Pi+lJ ~ ^Piyj + Pi-1, j I ^ _ \Pi,j ~ + Pi-2, j\ 

\Pi-\-l,j ‘^PiJ Pi-1, j\ \Pi,j "b ^Pi—l,j “b Pi—2,j\ 

\Pi,j-\-l ‘^Pi,j “b Pi^j—l\ \Pi,j ‘^Pi,j — 1 “b Pi,j — 2\ 

\Pi,j+l + ^PiJ + Pi,j-l\ ’ \Pi,j + 2pi,j-l + Pi,j-2\ 



(13) 

(14) 

(15) 

(16) 
(17) 



The time step is obtained from the following formula, which expresses the stability
limitation of the RK method, where CFL = 2:



At < min 

i,j,k 



2 fAx'^ + A 

PaAvh + psAxk + - 5 - 

Fc y fJjij 



(18) 



where fiij — . . dxdy. The convergence of iterative process is followed using 

the behavior of residual in space L 2 



\\RW-a\, 



MN 






■ W: 






At 



2 . 1/2 



(19) 



resp. log||W/^.|| 2 , 



where MN is the number of finite volumes. 
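The multistage time stepping of Eqns. (8)-(10) can be sketched independently of the spatial discretization. In the minimal Python illustration below, the function name `rk_step` and the callable `residual` (standing for the steady residual including artificial viscosity) are assumptions; the finite volume residual of the paper is not reproduced.

```python
def rk_step(w, residual, dt, alphas=(0.5, 0.5, 1.0)):
    """One multistage Runge-Kutta step: every stage restarts from W^(0)."""
    w0 = w          # W^(0) = W^n
    wr = w0
    for alpha in alphas:
        # W^(r) = W^(0) - alpha_r * dt * R(W^(r-1))
        wr = w0 - alpha * dt * residual(wr)
    return wr       # W^(n+1) = W^(m)
```

Applied to the scalar model problem dW/dt = -W, a step gives the amplification factor 1 - dt + dt^2/2 - dt^3/4, which matches the exact decay up to second order, in line with the stated order of the scheme. The same function works for NumPy arrays, since only arithmetic on `w` is used.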
3 Some numerical results

In this section we present numerical results achieved using the numerical
methods described above. First we show results for stationary cases, where the
outlet condition is stationary. For lower Reynolds numbers such as 500 or 1000 we
observe good convergence of the scheme and smooth results; for higher Reynolds
numbers we obtain good convergence when using artificial viscosity. It must be
taken into account that the edges between the bypass and the vessel are sharp. This
problem should be eliminated in the future. Zones of separation can be seen
in the bypass after the bifurcation and also in the domain after the contraction of the vessel.
In the second part we present our first results for unsteady
flow. We start from the steady solution for a specific Reynolds number and, using
the same method as before and only changing the outlet condition from stationary
to unsteady, we obtain results for unsteady flow. The results presented here
are for a 20 percent contraction of the vessel and for bypasses whose diameter is about 40
or 30 percent of the diameter of the vessel. The distance between the vessel and
the bypass is constant in the presented cases. For the unsteady solution it is necessary
to use $a \to \infty$ or computation in dual (artificial) time.

Acknowledgement: This work was partly supported by grant GACR No. 201/00/0684
and Research Plan MSM 210000010.




Fig. 1. Bypass flows no. 1a, Re = 1000, vector field of velocity




Fig. 2. Bypass flows no. 1b, Re = 1000, isolines of velocity
Fig. 15. Bypass unst. flows no. 7, Re = 500, vector field of velocity
Summary. A new a-posteriori error estimator is presented for the verification of
the dimensionally reduced models stemming from the elliptic problems on thin do- 
mains. The original problem is considered in a general setting, without any specific 
assumptions on the domain geometry, coefficients and the right-hand sides. The es- 
timator provides a guaranteed upper bound for the modelling error in the energy 
norm, exhibits the optimal convergence rate as the domain thickness tends to zero 
and accurately indicates the local error distribution. 



1 Introduction 

The method of dimension reduction is a popular approach frequently used
by engineers for the approximate solution of problems posed in thin domains.
The term “thin” means that the size of the original physical domain
along one coordinate direction is much smaller than along the others; this
allows one to make some simplifying assumptions on the behaviour of the exact
solution and to replace the original, for instance three-dimensional, problem
by a two-dimensional one. It is, however, clear that the solution of the new,
“reduced” problem will, in general, differ from the solution to the original
high-dimensional problem. Thus, the dimension reduction method unavoidably
produces an error that can be referred to as the dimension reduction
or the modelling error. An essential part of the model verification is, hence,
a reliable a posteriori control of the dimension reduction error.

Despite the practical importance of the topic, only a few a posteriori esti- 
mators for the dimension reduction error have been introduced so far. In [10] 
and [2] (see also [1]) residual- type estimators were proposed and proved reli- 
able and efficient under the assumptions that the right-hand side of the given 
equation is zero and the original domain is a plate with plane parallel faces. In 
[3] and [8] implicit estimators based on the solution of local Neumann prob- 
lems were developed; the estimators were intended for hierarchical modelling 
and involved the solution of local three-dimensional problems. 

In this work we propose a reliable and efficient a posteriori estimator for the
dimension reduction error in the energy norm, making no specific assumptions
on the right-hand side of the given equation and considering a general geometry
of the given domain. We show that, for the zero-order dimension reduction
method considered here, the estimator of Babuska and Schwab (see [1], [2]) 
can be obtained as a particular case of our estimator when the right-hand side 
of the equation is zero and the original domain is a plate with plane parallel 
faces. We demonstrate the optimal convergence of the estimator as the plate 
thickness tends to zero (although, it is worth noting that the proposed estima- 
tor preserves its reliability for any positive thickness). Finally, we observe how 
accurately the estimator indicates the local error distribution, thus, allowing 
for a local improvement of the model. 



2 Problem setting 

We consider a three-dimensional Lipschitz domain

$$\Omega := \{ x \in \mathbb{R}^3 \mid (x_1, x_2) \in \widehat{\Omega},\ d_\ominus(x_1, x_2) < x_3 < d_\oplus(x_1, x_2) \},$$

where $\widehat{\Omega} \subset \mathbb{R}^2$ is its projection on the $(x_1, x_2)$-plane ($\widehat{\Omega}$ has the Lipschitz
boundary $\widehat{\Gamma}$) and $d_\ominus$ and $d_\oplus$ are Lipschitz continuous functions $\widehat{\Omega} \to \mathbb{R}$. The
lower and upper faces of $\Omega$ are denoted by

$$\Gamma_\ominus := \{ x \in \mathbb{R}^3 \mid (x_1, x_2) \in \widehat{\Omega},\ x_3 = d_\ominus(x_1, x_2) \}$$

and

$$\Gamma_\oplus := \{ x \in \mathbb{R}^3 \mid (x_1, x_2) \in \widehat{\Omega},\ x_3 = d_\oplus(x_1, x_2) \},$$

the lateral boundary by

$$\Gamma_0 := \{ x \in \mathbb{R}^3 \mid (x_1, x_2) \in \widehat{\Gamma},\ d_\ominus(x_1, x_2) < x_3 < d_\oplus(x_1, x_2) \}$$

(see Figure 1).

Remark. We consider $d_\ominus$ and $d_\oplus$ as explicit functions of the $(x_1, x_2)$-coordinates
only for the sake of simplicity. The generalization of the theory to the case of an
arbitrary Lipschitz domain $\Omega$ presents no difficulty from the conceptual
point of view.

The assumption that the given domain $\Omega$ is “thin” can now be written as

$$\mathrm{diam}\, \widehat{\Omega} \gg \max_{(x_1, x_2) \in \widehat{\Omega}} d(x_1, x_2), \qquad (1)$$

where $d = d_\oplus - d_\ominus$ is the domain thickness, $d(x_1, x_2) > d^* > 0$ $\forall (x_1, x_2) \in \widehat{\Omega}$.
Although the assumption is of purely qualitative nature, it serves as a basis
for the derivation of the corresponding two-dimensional reduced model. We
also have to notice that Figure 1 depicts a simplified case; in the geometrical
definitions we do not assume the domain thickness $d(x_1, x_2)$ to be a constant.
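Fig. 1. Sketch of the domain geometry

In the domain $\Omega$ we consider a model elliptic problem

$$-\mathrm{Div}\,(A \nabla u) = f \quad \text{in } \Omega, \qquad (2)$$

$$u = 0 \quad \text{on } \Gamma_0, \qquad (3)$$

$$A \nabla u \cdot \nu_\ominus = F_\ominus \quad \text{on } \Gamma_\ominus, \qquad (4)$$

$$A \nabla u \cdot \nu_\oplus = F_\oplus \quad \text{on } \Gamma_\oplus, \qquad (5)$$

where $f \in L^2(\Omega)$, $F_\ominus \in L^2(\Gamma_\ominus)$, $F_\oplus \in L^2(\Gamma_\oplus)$, and $\nu_\ominus$ and $\nu_\oplus$ are outward normal vectors
at $\Gamma_\ominus$ and $\Gamma_\oplus$ respectively. The matrix $A = (a_{ij}(x))_{i,j=1}^{3}$ with the components
from $L^\infty(\Omega)$ is symmetric and uniformly positive definite, i.e. there exist
constants $0 < c \le C < \infty$ such that

$$c\, |\xi|^2 \le A(x) \xi \cdot \xi \le C\, |\xi|^2 \quad \forall \xi \in \mathbb{R}^3, \ \text{a.e. in } \Omega.$$

From now on we will frequently use the notation $\hat{x} = (x_1, x_2)$, $x = (\hat{x}, x_3)$, and
all functions depending only on $(x_1, x_2)$ will be marked by a hat; in addition, we
will distinguish between the 3- and 2-dimensional divergence operators:

$$\mathrm{Div}\, \tau = \tau_{1,1} + \tau_{2,2} + \tau_{3,3}, \qquad \widehat{\mathrm{div}}\, \hat{\tau} = \hat{\tau}_{1,1} + \hat{\tau}_{2,2}.$$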
The weak form of the problem (2)-(5) reads

Problem (P): Find $u \in V_0 := \{ v \in H^1(\Omega) \mid v = 0 \text{ on } \Gamma_0 \}$ such that

$$\int_\Omega A \nabla u \cdot \nabla w \, dx = \int_\Omega f w \, dx + \int_{\Gamma_\ominus} F_\ominus w \, ds + \int_{\Gamma_\oplus} F_\oplus w \, ds \quad \forall w \in V_0. \qquad (6)$$



3 The reduced problem 



The assumption (1) allows one to suppose that

the exact solution $u \approx \mathrm{const}$ in the $x_3$-direction. (7)

This gives rise to the so-called zero-order reduced model for the original problem (6).
The model is very popular due to its simplicity and purely two-dimensional
formulation. A discussion of the hierarchy of reduced models
of different orders can be found in, e.g., [9], [2].

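Then, introducing the subspace

$\widehat V_0 := \{v \in V_0 \mid \exists\,\hat v \in H_0^1(\widehat\Omega) \text{ such that } v(x) = \hat v(\hat x) \text{ for a.e. } x = (\hat x, x_3) \in \Omega\}$

and the operation $(\,\widetilde{\cdot}\,)$ of averaging in the $x_3$-direction

$\forall g \in L_1(\Omega):\quad \tilde g(\hat x) := \dfrac{1}{d(\hat x)} \displaystyle\int_{d_\ominus(\hat x)}^{d_\oplus(\hat x)} g(\hat x, x_3)\,dx_3$ for a.e. $\hat x \in \widehat\Omega$,

we can deduce from (6) the reduced problem (see [7]) that reads

Problem ($\widehat P$): Find $\hat u \in \widehat V_0$ such that

$\int_{\widehat\Omega} d(\hat x)\,\widehat A_p(\hat x)\nabla\hat u\cdot\nabla\hat w\,d\hat x = \int_{\widehat\Omega} d(\hat x)\,\hat f(\hat x)\,\hat w\,d\hat x \quad \forall \hat w \in \widehat V_0$, (8)

where $\hat f := \tilde f + \dfrac{1}{d}\Big(F_\ominus\sqrt{1+|\nabla d_\ominus|^2} + F_\oplus\sqrt{1+|\nabla d_\oplus|^2}\Big)$ and $\widehat A_p(\hat x) := (\tilde a_{ij}(\hat x))_{i,j=1,2}$ is the
averaged “plane” part ($A_p(x) = (a_{ij}(x))_{i,j=1,2}$) of the matrix $A$.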
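The averaging operation defined above can be evaluated numerically in a straightforward way. The following minimal sketch is only an illustration: the function name `average_in_x3`, its arguments and the use of the composite Simpson rule are assumptions made here, not part of the original derivation.

```python
def average_in_x3(g, d_lower, d_upper, x_hat, n=200):
    """Approximate the thickness average of g at the plane point x_hat.

    g        : callable g(x_hat, x3), the function to be averaged
    d_lower  : callable returning the lower-face height at x_hat
    d_upper  : callable returning the upper-face height at x_hat
    n        : number of Simpson subintervals (must be even)
    """
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    a, b = d_lower(x_hat), d_upper(x_hat)
    d = b - a                      # local thickness, assumed positive
    h = d / n
    total = g(x_hat, a) + g(x_hat, b)
    for k in range(1, n):
        weight = 4 if k % 2 else 2  # Simpson weights 4, 2, 4, ...
        total += weight * g(x_hat, a + k * h)
    integral = total * h / 3.0
    return integral / d            # divide by thickness to get the mean


# Hypothetical usage: average x3**2 over a plate of thickness 0.2
mean_value = average_in_x3(lambda xh, t: t * t,
                           lambda xh: -0.1, lambda xh: 0.1, (0.3, 0.7))
```

For a function that does not depend on $x_3$ the average reproduces the function itself, which is why the reduced solution $\hat u$ needs no averaging.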

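It is clear that problem (8) is a two-dimensional elliptic problem with the
homogeneous Dirichlet boundary condition:

$-\mathrm{div}\,\big(d(\hat x)\,\widehat A_p(\hat x)\nabla\hat u\big) = d(\hat x)\,\hat f(\hat x)$ in $\widehat\Omega$, (9)

$\hat u = 0$ on $\widehat\Gamma$. (10)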



4 A posteriori estimation of the modelling error 

In order to control the dimension reduction error $e := u - \hat u$, we apply the
functional-type a posteriori error estimate derived in [6] (see also [4] and [5])
to the original three-dimensional problem (6):

For all $\gamma > 0$, $\delta > 0$ and $y^* \in H^*(\Omega,\mathrm{Div})$ there holds

$|||u - \hat u|||^2 \le (1+\gamma)\,M_1^2 + \Big(1+\dfrac{1}{\gamma}\Big)(1+\delta)\,C_\Omega^2 M_2^2 + \Big(1+\dfrac{1}{\gamma}\Big)\Big(1+\dfrac{1}{\delta}\Big)C_\Gamma^2\,(1+C_\Omega^2)\,M_3^2$, (11)

where $|||\cdot|||$ is the energy norm, $|||v||| := \big(\int_\Omega A(x)\nabla v\cdot\nabla v\,dx\big)^{1/2}$ $\forall v \in V_0$,



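$C_\Omega$ is the constant from Friedrichs’ inequality $\Big(C_\Omega = \sup_{w \in V_0\setminus\{0\}} \dfrac{\|w\|_{L_2(\Omega)}}{|||w|||}\Big)$, $C_\Gamma$
is the constant from the trace inequality $\Big(C_\Gamma^2 = \sup_{w \in V_0\setminus\{0\}} \dfrac{\|w\|^2_{L_2(\Gamma_\ominus)} + \|w\|^2_{L_2(\Gamma_\oplus)}}{|||w|||^2 + \|w\|^2_{L_2(\Omega)}}\Big)$,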

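and the functionals $M_1^2$, $M_2^2$, $M_3^2$ are defined as follows:

$M_1^2 := \int_\Omega (\nabla\hat u - A^{-1}y^*)\cdot(A\nabla\hat u - y^*)\,dx$,

$M_2^2 := \|\mathrm{Div}\,y^* + f\|^2_{L_2(\Omega)}$,

$M_3^2 := \|F_\ominus - y^*\cdot\nu_\ominus\|^2_{L_2(\Gamma_\ominus)} + \|F_\oplus - y^*\cdot\nu_\oplus\|^2_{L_2(\Gamma_\oplus)}$.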

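We emphasize that the estimate is valid for any positive numbers $\gamma$ and $\delta$ and
for any vector-function $y^*$ from the space $H^*(\Omega,\mathrm{Div})$ defined as

$H^*(\Omega,\mathrm{Div}) := \{y^* \in L_2(\Omega,\mathbb{R}^3) \mid \mathrm{Div}\,y^* \in L_2(\Omega),\ y^*\cdot\nu_\ominus \in L_2(\Gamma_\ominus),\ y^*\cdot\nu_\oplus \in L_2(\Gamma_\oplus)\}$.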

While the best possible option would be to take as $y^*$ the exact flux $A\nabla u$
(then $M_2$ and $M_3$ would vanish and $M_1$ would give us the energy norm of
the exact error $e$), we have to restrict ourselves to choosing some computable
quantity, i.e. one not containing the unknown exact solution $u$. We approximate
the flux by

$y^* = \widehat A_p\nabla\hat u + \tau^*$, (12)

where $\tau^* = \{0, 0, \psi(x)\}^T$, and $\psi$ is an auxiliary function from $L_2(\Omega)$ such that
$\psi_{,3} \in L_2(\Omega)$, $\psi \in L_2(\Gamma_\ominus)$ and $\psi \in L_2(\Gamma_\oplus)$. Using (9), it is easy to verify that
$y^*$ from (12) belongs to $H^*(\Omega,\mathrm{Div})$. A discussion about other choices of $y^*$
can be found in [7].

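Substituting (12) into the functionals $M_1^2$, $M_2^2$, $M_3^2$, we obtain (see the
details in [7])

$M_1^2 = \int_\Omega \big(b_{33}\psi^2 + 2(b_3\cdot\widehat A_p\nabla\hat u)\,\psi\big)\,dx + \int_{\widehat\Omega} d(\hat x)\,(\widehat B_p\widehat A_p - I)\nabla\hat u\cdot\widehat A_p\nabla\hat u\,d\hat x$, (13)

$M_2^2 = \Big\| f - \tilde f - \dfrac{F_\ominus\sqrt{1+|\nabla d_\ominus|^2} + F_\oplus\sqrt{1+|\nabla d_\oplus|^2}}{d} + \psi_{,3} - \dfrac{\nabla d}{d}\cdot\widehat A_p\nabla\hat u \Big\|^2_{L_2(\Omega)}$, (14)

$M_3^2 = \|F_\ominus - \widehat A_p\nabla\hat u\cdot\hat\nu_\ominus - \psi\,\nu_{\ominus 3}\|^2_{L_2(\Gamma_\ominus)} + \|F_\oplus - \widehat A_p\nabla\hat u\cdot\hat\nu_\oplus - \psi\,\nu_{\oplus 3}\|^2_{L_2(\Gamma_\oplus)}$, (15)

where $\widehat B_p$ is the averaged “plane” part of the matrix $B := A^{-1}$ (i.e., if $B(x) =
(b_{ij}(x))_{i,j=1}^{3}$, then $\widehat B_p(\hat x) = (\tilde b_{ij}(\hat x))_{i,j=1,2}$), the vector $b_3 := (b_{31}, b_{32})^T$ and
$I$ is the $2\times 2$ identity matrix.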

Now we still have the freedom of choosing the auxiliary function $\psi$ that in
the case of the Poisson equation should, obviously, approximate the derivative
$u_{,3}$ of the exact solution in the $x_3$-direction. The simplest choice is to take such
a $\psi$ that the term $M_3$ (i.e. the residual in the Neumann boundary condition)
is identically zero. This can be immediately achieved by letting $\psi(x) =
\alpha(\hat x)\,x_3 + \beta(\hat x)$ with the coefficient functions $\alpha, \beta \in L_2(\widehat\Omega)$ uniquely determined
by the requirement $M_3 = 0$. Other options for the function $\psi$ are considered
in [7]. Then, minimizing the right-hand side of (11) with respect to the scalar
parameters $\gamma > 0$ and $\delta > 0$, we arrive at the estimate

$|||u - \hat u||| \le M := M_1 + C_\Omega M_2$, (16)

where $M_1$ and $M_2$ are defined by (13) and (14). The error majorant $M$ has
been derived for quite general geometry of $\Omega$ and coefficient matrix $A(x)$.
However, to make the estimate more transparent, we consider two particular
cases.
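The passage from (11) to (16) can be checked by a short calculation. The sketch below assumes $M_3 = 0$ and $M_1, M_2 > 0$; the symbol $\gamma_*$ for the minimizing parameter is introduced here only for illustration.

```latex
% With M_3 = 0 the last term of (11) vanishes, and letting \delta \to 0 gives
\[
  |||u-\hat u|||^2 \le \Phi(\gamma) := (1+\gamma)\,M_1^2
      + \Big(1+\frac{1}{\gamma}\Big) C_\Omega^2 M_2^2 .
\]
% Stationarity in \gamma:
\[
  \Phi'(\gamma) = M_1^2 - \frac{C_\Omega^2 M_2^2}{\gamma^2} = 0
  \quad\Longrightarrow\quad
  \gamma_* = \frac{C_\Omega M_2}{M_1} ,
\]
% and the minimal value is a perfect square:
\[
  \Phi(\gamma_*) = M_1^2 + 2\,C_\Omega M_1 M_2 + C_\Omega^2 M_2^2
                 = \big(M_1 + C_\Omega M_2\big)^2 .
\]
```

Taking square roots yields exactly the bound $M_1 + C_\Omega M_2$ in (16).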



4.1 Plate of constant thickness 

We assume that

$d_\oplus = d_\ominus + d_0$ ($d_0 = \mathrm{const} > 0$) (17)

and, in addition, that

$A = A(\hat x)$ (this immediately implies $B = B(\hat x)$), (18)

$a_{31} = a_{32} = 0$ (this yields $B_p = A_p^{-1}$, $b_{33} = a_{33}^{-1}$, $b_{31} = b_{32} = 0$). (19)

With these assumptions the terms $M_1$ and $M_2$ in estimate (16) become simpler:

$M_1 = \Big(\int_\Omega a_{33}^{-1}\psi^2\,dx\Big)^{1/2}$, $\quad M_2 = \|f - \tilde f\|_{L_2(\Omega)}$. (20)
One may notice that the integral in the first term $M_1$ of the error majorant
$M$ can be rewritten as

$\int_\Omega a_{33}^{-1}\psi^2\,dx = d_0\cdot\int_{\widehat\Omega} \dfrac{a_{33}^{-1}}{3}\big(\psi_\oplus^2 + \psi_\oplus\psi_\ominus + \psi_\ominus^2\big)\,d\hat x$, $\quad \psi_\oplus := \psi|_{x_3 = d_\oplus}$, $\ \psi_\ominus := \psi|_{x_3 = d_\ominus}$,

which means that the term $M_1$ is of order $O(d_0^{1/2})$ when the plate thickness $d_0$
tends to zero. If $f \in L_\infty(\Omega)$, the second term $M_2$ is obviously of the same order
$O(d_0^{1/2})$, i.e. the whole estimator $M$ converges to zero with the rate $O(d_0^{1/2})$
as $d_0 \to 0$. This is the optimal convergence rate for the modelling error $e$ in
the energy norm, as was shown in [9] for the simpler case of a plate with plane
parallel faces and $f = 0$. It is worth noting that, for sufficiently regular $f$, the second
term in $M$ is of higher order $O(d_0^{3/2})$ as compared to the first term.



4.2 Plate with plane parallel faces 

If, in addition to (18), (19), we strengthen the assumption (17) replacing it by

$d_\oplus = \dfrac{d_0}{2}$, $\quad d_\ominus = -\dfrac{d_0}{2}$ ($d_0 = \mathrm{const} > 0$), (21)

the auxiliary function $\psi$ will take the simple form $\psi = \dfrac{F_\oplus + F_\ominus}{d_0}\,x_3 + \dfrac{F_\oplus - F_\ominus}{2}$

and the error estimate (16) will read

$|||u - \hat u||| \le \sqrt{\dfrac{d_0}{3}}\,\Big(\int_{\widehat\Omega} a_{33}^{-1}\,(F_\oplus^2 + F_\ominus^2 - F_\oplus F_\ominus)\,d\hat x\Big)^{1/2} + C_\Omega\,\|f - \tilde f\|_{L_2(\Omega)}$. (22)

If we set here $f = 0$, $a_{33} = 1$ and $F_\oplus = F_\ominus = F$, we obtain

$|||u - \hat u||| \le \sqrt{\dfrac{d_0}{3}}\,\|F\|_{L_2(\widehat\Omega)}$,

which is exactly the estimator of Babuška and Schwab (see [1]) for the zero-order
reduced model. Thus, the latter estimator can be obtained as a particular
case of the error majorant (16) if one makes the assumptions (18), (19), (21)
and sets $f = 0$. This is a particularly interesting fact, since we advocate an
estimation approach that is completely different from the one utilized in [1]
(see the details in [7] and [6]).
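The first term of (22) rests on an elementary identity for the linear function $\psi$ on $[-d_0/2,\, d_0/2]$. The following sketch checks it in exact rational arithmetic; the function names are hypothetical and the sample values are arbitrary, chosen only to exercise the formulas.

```python
from fractions import Fraction


def psi_coefficients(F_plus, F_minus, d0):
    """Coefficients of psi(x3) = alpha*x3 + beta fixed by M_3 = 0 on flat faces,
    i.e. psi(d0/2) = F_plus and -psi(-d0/2) = F_minus."""
    alpha = (F_plus + F_minus) / d0
    beta = (F_plus - F_minus) / 2
    return alpha, beta


def integral_psi_squared(alpha, beta, d0):
    """Exact integral of (alpha*t + beta)**2 over [-d0/2, d0/2];
    the mixed term integrates to zero by symmetry."""
    return alpha**2 * d0**3 / 12 + beta**2 * d0


def closed_form(F_plus, F_minus, d0):
    """Pointwise integrand of the first term in (22), including the factor d0/3."""
    return d0 * (F_plus**2 + F_minus**2 - F_plus * F_minus) / 3


# Arbitrary sample data (not taken from the paper)
F_plus, F_minus, d0 = Fraction(3), Fraction(-2, 5), Fraction(1, 10)
alpha, beta = psi_coefficients(F_plus, F_minus, d0)
print(integral_psi_squared(alpha, beta, d0) == closed_form(F_plus, F_minus, d0))
```

Because the identity holds pointwise in $\hat x$, multiplying by $a_{33}^{-1}$ and integrating over $\widehat\Omega$ gives the first term of (22).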




5 Numerical example 

In order to analyse the performance of the proposed error estimator, we consider
a simple two-dimensional test problem in the “sine-shape” domain (see
Figure 2 (left)) whose upper and lower faces are given by

$d_{\oplus,\ominus}(x) = \sin(k\pi x) \pm \dfrac{d_0}{2}$, $\quad k = 1, 2, \ldots$,

where $d_0 > 0$ is the domain thickness. In this example, $\widehat\Omega = (0,1)$ and $\Omega =
\{(x,y) \mid x \in \widehat\Omega,\ d_\ominus(x) < y < d_\oplus(x)\}$. The considered problem is

$-\Delta u = f$ in $\Omega$,

$u = 0$ at $x = 0$ and $x = 1$,

$\nabla u\cdot\nu_{\ominus,\oplus} = F_{\ominus,\oplus}$ at $y = d_{\ominus,\oplus}$,

and the right-hand sides of the equation and of the boundary condition are
computed using the exact solution

$u(x,y) = \sin(\pi x)\cdot y^m$ ($m = 1, 2, \ldots$).

The reduced problem (8) is, in this case, a one-dimensional Dirichlet problem
that, of course, can be solved very accurately (in the present work, we address
the estimation of the modelling error only, assuming that the discretization
error stemming from the solution of the reduced problem is negligible).

Figure 2 (right) shows the convergence rates of the exact modelling error in
the energy norm ($|||e|||$) and of the error majorant $M$ as the domain thickness
$d_0$ tends to zero. It is clear that both the exact error and the majorant converge
to zero with the theoretically predicted, optimal rate $O(d_0^{1/2})$, and, moreover,
the effectivity index $M/|||e|||$ demonstrates the asymptotics $1 + O(d_0)$. It
is also important to note that the presented error estimator provides a reliable
upper bound for the exact error at any positive value of the domain thickness
$d_0$, i.e. also in the cases when the domain is not “thin” at all.

Finally, the local error distributions provided by the exact error and by the
first term $M_1$ of the majorant are depicted in Figure 3. The figure shows that
already for the rather large value of the domain thickness $d_0 = 0.1$ the majorant
delivers sufficiently accurate information on the location of the regions of the
biggest modelling error, while for $d_0 = 0.05$ the exact and the estimated error
distributions practically coincide.
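Convergence rates and effectivity indices of the kind reported above are usually extracted from tabulated values. The following sketch shows one way to do this; the function names are hypothetical, and the numbers are synthetic data built to follow the laws $|||e||| \sim d_0^{1/2}$ and $M/|||e||| = 1 + d_0$, not results from the paper.

```python
import math


def observed_rates(thicknesses, values):
    """Observed orders p in values ~ C * d0**p between consecutive thicknesses."""
    rates = []
    for k in range(len(thicknesses) - 1):
        ratio_d = thicknesses[k] / thicknesses[k + 1]
        ratio_v = values[k] / values[k + 1]
        rates.append(math.log(ratio_v) / math.log(ratio_d))
    return rates


def effectivity_indices(majorants, errors):
    """Ratio of the error majorant to the exact error for each thickness."""
    return [m / e for m, e in zip(majorants, errors)]


# Synthetic illustration only (not data from the experiments in the text)
d0_values = [0.2, 0.1, 0.05, 0.025]
errors = [2.0 * math.sqrt(d) for d in d0_values]
majorants = [e * (1.0 + d) for e, d in zip(errors, d0_values)]

print(observed_rates(d0_values, errors))
print(effectivity_indices(majorants, errors))
```

An effectivity index that stays above one confirms reliability of the bound, and its approach to one as $d_0 \to 0$ indicates asymptotic exactness.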
Summary. The coupled Stokes and Darcy flows problem is solved by locally conservative
numerical methods. Discontinuous Galerkin methods are used in the Stokes
region and discontinuous Galerkin methods coupled with mixed finite element methods
are employed in the Darcy region. Optimal a priori error estimates are derived.



1 Introduction and Model Problem 

In this work, a numerical method for solving the coupled problem of Stokes
and Darcy equations is formulated and analyzed. The coupled system arises
from the study of the interaction between surface and subsurface flow. Discontinuous
finite elements and mixed finite elements (MFE) are used in subregions
of the subsurface domain while only discontinuous finite elements are used in
the surface domain. While mixed elements are popular and efficient on regular
grids, discontinuous Galerkin (DG) methods are accurate and easily implementable
on highly unstructured meshes. The proposed method makes it possible to
take advantage of either of these locally conservative methods in a particular
subregion of the subsurface. This work is an extension of the coupling of DG
for Stokes and MFE for Darcy [8], and DG for Stokes and Darcy [7]. Similar
couplings are studied in the work of Layton, Schieweck and Yotov [5] and
Discacciati and Quarteroni [2, 3].

Fig. 1. Computational domain

Let $\Omega$ be a domain in $\mathbb{R}^2$, subdivided into three
subdomains $\Omega_1$, $\Omega_2$ and $\Omega_3$. Denote by $\Gamma_{ij}$ the interface between $\Omega_i$ and $\Omega_j$
for $i < j$ (see Figure 1). Define also the boundary $\Gamma_i = \partial\Omega \cap \partial\Omega_i$ for $i = 1, 2, 3$.
The velocity $u_i$ (resp. pressure $p_i$) denotes the restriction of the fluid velocity
$u$ (resp. pressure $p$) to the subdomain $\Omega_i$. We assume that the fluid satisfies
the Stokes equations in $\Omega_1$.

$-\nabla\cdot(2\mu D(u_1) - p_1 I) = f_1$, in $\Omega_1$, (1)

$\nabla\cdot u_1 = 0$, in $\Omega_1$. (2)

Here, $f_1$ is an external force acting on the fluid, $\mu > 0$ is the constant fluid
viscosity and $D(u) = \frac{1}{2}(\nabla u + \nabla u^T)$ is the strain tensor. Let $n$ denote the unit
outward normal vector to the boundary $\partial\Omega$. The single phase flow problem is
solved on the region $\Omega_2 \cup \Omega_3$.

$\nabla\cdot u_i = f_i$, $\quad u_i = -K\nabla p_i$, in $\Omega_i$, $i = 2, 3$. (3)

The permeability tensor $K$ is symmetric, positive definite, and uniformly bounded
below and above. As boundary conditions, we consider $u = 0$ on the Stokes
boundary $\Gamma_1$, and $K\nabla p\cdot n = 0$ on the Darcy boundary $\Gamma_2 \cup \Gamma_3$. At the
interface, the transmissibility conditions arise from the mass conservation, the
balance of forces across each interface, and the Beavers-Joseph-Saffman law
[1, 9]. Denoting by $n_{ij}$ (resp. $\tau_{ij}$) the normal (resp. tangential) unit vector to
the interface $\Gamma_{ij}$ for $i < j$, the conditions are written as:

$u_1\cdot n_{12} = -K\nabla p_2\cdot n_{12}$, on $\Gamma_{12}$, (4)

$u_1\cdot n_{13} = u_3\cdot n_{13}$, on $\Gamma_{13}$, (5)

$u_3\cdot n_{23} = -K\nabla p_2\cdot n_{23}$, $\quad p_2 = p_3$, on $\Gamma_{23}$, (6)

$p_1 - 2\mu\,(D(u_1)n_{1i})\cdot n_{1i} = p_i$, on $\Gamma_{1i}$, $i = 2, 3$, (7)

$u_1\cdot\tau_{1i} = -2G\,(D(u_1)n_{1i})\cdot\tau_{1i}$, on $\Gamma_{1i}$, $i = 2, 3$. (8)

As the pressure is unique up to an additive constant, we assume that $\int_\Omega p = 0$.
A weak solution $(u,p)$ of the coupled Stokes-Darcy equations exists (see the proof
in [5]). We assume that it is also a strong solution with enough regularity.
Section 2 defines the discrete spaces and the numerical method. Section 3
contains the error analysis.



2 Numerical Method 

The discontinuous Galerkin method is used in the region $\Omega_1 \cup \Omega_2$, while the
mixed finite element method is used in the region $\Omega_3$. For $i = 1, 2, 3$, let $\mathcal{E}_h^i$ be
a non-degenerate quasi-uniform subdivision of $\Omega_i$, let $\Gamma_h^i$ be the set of interior
edges and let $h$ denote the maximum diameter of elements. We assume that
the meshes match at the interface $\Gamma_{13}$, but they may not match at the
interfaces $\Gamma_{12}$ and $\Gamma_{23}$. Given a fixed normal vector $n_e$ on each interior edge,
pointing from $E_e^1$ to $E_e^2$, the average $\{w\}$ and jump $[w]$ of a function $w$ are defined by

$\{w\} = \frac{1}{2}(w|_{E_e^1}) + \frac{1}{2}(w|_{E_e^2})$, $\quad [w] = (w|_{E_e^1}) - (w|_{E_e^2})$, $\quad \forall e = \partial E_e^1 \cap \partial E_e^2$.
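To make the two trace operators concrete, the following minimal sketch evaluates them at a single point of an edge, given the two one-sided values. The function names are hypothetical; the product rule checked at the end is a standard algebraic identity for these operators and is not taken from the text.

```python
def average(w_E1, w_E2):
    """Average {w} of the traces from the two neighbouring elements."""
    return 0.5 * w_E1 + 0.5 * w_E2


def jump(w_E1, w_E2):
    """Jump [w]: trace from E_e^1 minus trace from E_e^2
    (the normal n_e points from E_e^1 to E_e^2)."""
    return w_E1 - w_E2


# One-sided values of two discontinuous functions v and w at an edge point
v1, v2 = 1.5, -0.5
w1, w2 = 2.0, 4.0

# Product rule: [v w] = {v}[w] + [v]{w}
lhs = jump(v1 * w1, v2 * w2)
rhs = average(v1, v2) * jump(w1, w2) + jump(v1, v2) * average(w1, w2)
print(abs(lhs - rhs) < 1e-14)
```

Identities of this type are what allow element-wise integration by parts to be rewritten as sums of edge integrals of averages and jumps.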
On a boundary edge, the jump and average coincide with the trace of the
function. For any integer $k \ge 0$, the Sobolev space on a domain $\mathcal{O}$ is denoted
by $H^k(\mathcal{O}) = \{v \in L^2(\mathcal{O}) : D^m v \in L^2(\mathcal{O}),\ \forall |m| \le k\}$, with norm $\|\cdot\|_{k,\mathcal{O}}$. We
denote by $L_0^2(\mathcal{O})$ the space of square-integrable functions with zero average,
with inner product $(\cdot,\cdot)_{\mathcal{O}}$.

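Let $k_1$, $k_2$ and $k_3$ be positive integers. The DG discrete spaces are

$X_h^1 = \{v_1 \in (L^2(\Omega_1))^2 : \forall E \in \mathcal{E}_h^1,\ v_1|_E \in (\mathbb{P}_{k_1}(E))^2\}$,

$M_h^1 = \{q_1 \in L^2(\Omega_1) : \forall E \in \mathcal{E}_h^1,\ q_1|_E \in \mathbb{P}_{k_1-1}(E)\}$,

$M_h^2 = \{q_2 \in L^2(\Omega_2) : \forall E \in \mathcal{E}_h^2,\ q_2|_E \in \mathbb{P}_{k_2}(E)\}$.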



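The MFE discrete spaces are the classical ones, such as the Raviart-Thomas
spaces [6]. We assume that the mixed velocity space $X_h^3 \subset H(\mathrm{div};\Omega_3)$ contains
polynomials of degree $k_3$ and the pressure space $M_h^3 \subset L^2(\Omega_3)$ polynomials
of degree $k_3 - 1$. We associate to these spaces the following norms:

$\|v_1\|^2_{X^1} = \sum_{E\in\mathcal{E}_h^1} \|\nabla v_1\|^2_{0,E} + \sum_{e\in\Gamma_h^1\cup\Gamma_1} \dfrac{\sigma_{1,e}}{|e|}\,\|[v_1]\|^2_{0,e}$,

$\|q_2\|^2_{M^2} = \sum_{E\in\mathcal{E}_h^2} \|\nabla q_2\|^2_{0,E} + \sum_{e\in\Gamma_h^2} \dfrac{\sigma_{2,e}}{|e|}\,\|[q_2]\|^2_{0,e}$,

$\|v_3\|_{X^3} = \|v_3\|_{0,\Omega_3}$, $\quad \|q_1\|_{M^1} = \|q_1\|_{0,\Omega_1}$, $\quad \|q_3\|_{M^3} = \|q_3\|_{0,\Omega_3}$.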



Here, $|e|$ denotes the measure of each edge $e$, and the parameters $\sigma_{1,e}$ and $\sigma_{2,e}$ are
positive constants defined later. Throughout the paper, $C$ denotes a generic
positive constant that is independent of the mesh size $h$. We now define the
global finite element spaces.



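$X_h = \{v = (v_1, v_3) : v_i \in X_h^i,\ \forall \eta \in X_h^3\cdot n_{13},\ \int_{\Gamma_{13}} \eta\,(v_1 - v_3)\cdot n_{13} = 0\}$, (9)

$M_h = \{q = (q_1, q_2, q_3) : q_i \in M_h^i,\ q \in L_0^2(\Omega)\}$. (10)

We recall a result proved in [4] that generalizes a Sobolev imbedding. There
exists a constant $C$ independent of $h$ such that

$\forall v_1 \in X_h^1,\ \forall q_2 \in M_h^2$, $\quad \|v_1\|_{0,\Omega_1} \le C\,\|v_1\|_{X^1}$, $\quad \|q_2\|_{0,\Omega_2} \le C\,\|q_2\|_{M^2}$. (11)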

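We introduce the following bilinear forms $a_1 : X_h^1 \times X_h^1 \to \mathbb{R}$, $b_1 : X_h^1 \times M_h^1 \to \mathbb{R}$,
$a_2 : M_h^2 \times M_h^2 \to \mathbb{R}$, $a_3 : X_h^3 \times X_h^3 \to \mathbb{R}$ and $b_3 : X_h^3 \times M_h^3 \to \mathbb{R}$:

$a_1(u_1,v_1) = 2\mu \sum_{E\in\mathcal{E}_h^1} \int_E D(u_1) : D(v_1) + \mu \sum_{e\in\Gamma_h^1\cup\Gamma_1} \dfrac{\sigma_{1,e}}{|e|} \int_e [u_1]\cdot[v_1]$
$\qquad - 2\mu \sum_{e\in\Gamma_h^1\cup\Gamma_1} \int_e \{D(u_1)n_e\}\cdot[v_1] + 2\mu\,\epsilon_1 \sum_{e\in\Gamma_h^1\cup\Gamma_1} \int_e \{D(v_1)n_e\}\cdot[u_1]$
$\qquad + \dfrac{\mu}{G} \int_{\Gamma_{12}} (u_1\cdot\tau_{12})(v_1\cdot\tau_{12}) + \dfrac{\mu}{G} \int_{\Gamma_{13}} (u_1\cdot\tau_{13})(v_1\cdot\tau_{13})$,

$b_1(v_1,p_1) = -\sum_{E\in\mathcal{E}_h^1} \int_E p_1\,\nabla\cdot v_1 + \sum_{e\in\Gamma_h^1\cup\Gamma_1} \int_e \{p_1\}\,[v_1]\cdot n_e$,

$a_2(p_2,q_2) = \sum_{E\in\mathcal{E}_h^2} \int_E K\nabla p_2\cdot\nabla q_2 + \sum_{e\in\Gamma_h^2} \dfrac{\sigma_{2,e}}{|e|} \int_e [p_2][q_2]$
$\qquad - \sum_{e\in\Gamma_h^2} \int_e \{K\nabla p_2\cdot n_e\}[q_2] + \epsilon_2 \sum_{e\in\Gamma_h^2} \int_e \{K\nabla q_2\cdot n_e\}[p_2]$,

$a_3(u_3,v_3) = \int_{\Omega_3} K^{-1}u_3\cdot v_3$, $\quad b_3(v_3,q_3) = \int_{\Omega_3} q_3\,\nabla\cdot v_3$.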



By introducing the parameters $\epsilon_1, \epsilon_2$ that take the value $1$ or $-1$, we allow
for non-symmetric or symmetric bilinear forms $a_1$ and $a_2$. We assume that
in the non-symmetric case ($\epsilon_1 = 1$), the parameter $\sigma_{1,e}$ is equal to $1$ and
in the symmetric case ($\epsilon_1 = -1$), the parameter $\sigma_{1,e}$ is bounded below by
a sufficiently large positive value. The same assumptions hold true for $\epsilon_2$ and
$\sigma_{2,e}$. Combining the subdomain bilinear forms, we define

$A(u,p;v,q) = a_1(u_1,v_1) + a_2(p_2,q_2) + a_3(u_3,v_3)$, $\quad \forall u, v \in X_h$, $\forall p, q \in M_h$, (12)

$B(v,q) = b_1(v_1,q_1) - b_3(v_3,q_3)$, $\quad \forall v \in X_h$, $\forall q \in M_h$. (13)



It is easy to show that there is a coercivity constant $\kappa > 0$ such that

$A(v,q;v,q) \ge \kappa\,\big(\|v_1\|^2_{X^1} + \|v_3\|^2_{X^3} + \|q_2\|^2_{M^2}\big)$, $\quad \forall v \in X_h$, $\forall q \in M_h$. (14)

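Finally, we define a bilinear form $\Lambda : (X_h \times M_h) \times (X_h \times M_h) \to \mathbb{R}$ acting on
the interfaces $\Gamma_{12}$ and $\Gamma_{23}$:

$\Lambda(u,p;v,q) = \int_{\Gamma_{12}} (p_2\,v_1\cdot n_{12} - q_2\,u_1\cdot n_{12}) + \int_{\Gamma_{23}} (q_2\,u_3\cdot n_{23} - p_2\,v_3\cdot n_{23})$.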
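The interface form is skew-symmetric, a property used later in the uniqueness argument. A one-line check, written out here only as an illustration, follows by inserting $(u,p) = (v,q)$ into the definition:

```latex
\[
  \Lambda(v,q;v,q)
  = \int_{\Gamma_{12}} \big(q_2\, v_1\cdot n_{12} - q_2\, v_1\cdot n_{12}\big)
  + \int_{\Gamma_{23}} \big(q_2\, v_3\cdot n_{23} - q_2\, v_3\cdot n_{23}\big)
  = 0 .
\]
```

Consequently the interface coupling does not affect the coercivity bound (14) when the discrete problem is tested with its own solution.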

With these forms, the numerical method is: find $(U,P) \in X_h \times M_h$ such that

$\forall (v,q) \in X_h \times M_h$, $\quad A(U,P;v,q) + B(v,P) + \Lambda(U,P;v,q) = (f_1,v_1)_{\Omega_1} + (f_2,q_2)_{\Omega_2}$, (15)

$\forall q \in M_h$, $\quad B(U,q) = -(f_3,q_3)_{\Omega_3}$. (16)
Lemma 1. If $(u,p)$ is the strong solution of the coupled Stokes-Darcy flow
problem, then $(u,p)$ satisfies the variational equations: for all $(v,q) \in X_h \times M_h$

$A(u,p;v,q) + B(v,p) + \Lambda(u,p;v,q) = (f_1,v_1)_{\Omega_1} + (f_2,q_2)_{\Omega_2} - \int_{\Gamma_{13}} p_3\,(v_1 - v_3)\cdot n_{13}$, (17)

$B(u,q) = -(f_3,q_3)_{\Omega_3}$. (18)



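Proof. Multiplying the Stokes equation (1) by $v_1 \in X_h^1$, integrating by parts
over one element $E$, summing over all elements in $\mathcal{E}_h^1$ and using the regularity
of the strong solution yields:

$\sum_{E\in\mathcal{E}_h^1} \int_E \big(2\mu D(u_1) : D(v_1) - p_1\,\nabla\cdot v_1\big)$
$\quad - \sum_{e\in\Gamma_h^1} \int_e \{-p_1 I + 2\mu D(u_1)\}n_e\cdot[v_1] + \epsilon_1 \sum_{e\in\Gamma_h^1} \int_e \{2\mu D(v_1)\}n_e\cdot[u_1]$
$\quad - \sum_{e\in\Gamma_{12}\cup\Gamma_{13}} \int_e (-p_1 I + 2\mu D(u_1))\,n_{1i}\cdot v_1$
$\quad - \sum_{e\in\Gamma_1} \int_e (-p_1 I + 2\mu D(u_1))\,n\cdot v_1 + \epsilon_1 \sum_{e\in\Gamma_1} \int_e 2\mu D(v_1)\,n\cdot u_1 = \int_{\Omega_1} f_1\cdot v_1$.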

Using the interface conditions (7), (8), we obtain

$a_1(u_1,v_1) + b_1(v_1,p_1) + \sum_{e\in\Gamma_{12}} \int_e p_2\,v_1\cdot n_{12} + \sum_{e\in\Gamma_{13}} \int_e p_3\,v_1\cdot n_{13} = (f_1,v_1)_{\Omega_1}$.



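Similarly, multiplying (3) by a test function $q_2 \in M_h^2$, integrating by parts
on one element, summing over all elements in $\mathcal{E}_h^2$ and using the boundary
condition yields:

$a_2(p_2,q_2) - \int_{\Gamma_{23}} K\nabla p_2\cdot n_{23}\,q_2 + \int_{\Gamma_{12}} K\nabla p_2\cdot n_{12}\,q_2 = (f_2,q_2)_{\Omega_2}$.

With the conditions (4) and (6), we obtain

$a_2(p_2,q_2) - \sum_{e\in\Gamma_{12}} \int_e u_1\cdot n_{12}\,q_2 + \sum_{e\in\Gamma_{23}} \int_e u_3\cdot n_{23}\,q_2 = (f_2,q_2)_{\Omega_2}$.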

Rewriting (3) as $K^{-1}u_3 = -\nabla p_3$, we easily obtain by multiplying by $v_3 \in X_h^3$,
integrating by parts and using (6):

$a_3(u_3,v_3) - b_3(v_3,p_3) - \int_{\Gamma_{23}} p_2\,v_3\cdot n_{23} - \int_{\Gamma_{13}} p_3\,v_3\cdot n_{13} = 0$.

Finally, the regularity of $u$ and a simple integration of (3) give:

$b_1(u_1,q_1) = 0$, $\quad b_3(u_3,q_3) = (f_3,q_3)_{\Omega_3}$.

The final result is obtained by adding the previous variational equations.

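We now recall some approximation properties satisfied by the spaces $X_h$ and
$M_h$ [4, 8]. Given $v \in (H^1(\Omega_1 \cup \Omega_3))^2$, there exists $\tilde v \in X_h$ such that

$\forall q \in M_h$, $\quad B(v - \tilde v, q) = 0$, (19)

$\forall e \in \Gamma_h^1 \cup \Gamma_1$, $\forall q \in (\mathbb{P}_{k_1-1}(e))^2$, $\quad \int_e [\tilde v_1]\cdot q = 0$, (20)

$\forall e \in \Gamma_{12} \cup \Gamma_{13}$, $\forall q \in (\mathbb{P}_{k_1-1}(e))^2$, $\quad \int_e (\tilde v_1 - v_1)\cdot q = 0$, (21)

$\forall e \in \Gamma_{23}$, $\forall \eta \in X_h^3\cdot n_{23}$, $\quad \int_e (v_3 - \tilde v_3)\cdot n_{23}\,\eta = 0$. (22)

If, in addition, $v_1 \in (H^{k_1+1}(\Omega_1))^2$ and $v_3 \in (H^{k_3+1}(\Omega_3))^2$, the approximation $\tilde v$
satisfies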



||v-t;||o,X2i < + + ||v - + ^ ^ 

" ( 23 ) 

||v - vllx^ < C7/i''"+Vlfc3+i.r23- 

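Define also the projection $\tilde p$ of the pressure $p$. If $p_i \in H^{k_i}(\Omega_i)$, we have

$\forall q \in M_h^i$, $\forall E \in \mathcal{E}_h^i$, $i = 1, 2, 3$, $\quad \int_E q\,(\tilde p - p) = 0$, (24)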

m = 0, 1, 2-1,2, 3, Up, - piWm^a, < . (25) 

Lemma 2. The discrete solution to (15), (16) exists and is unique.

Proof. In a finite-dimensional setting, it suffices to show that the solution is
unique. Set $f_i = 0$ and $(v,q) = (U,P)$ in (15), (16). This yields $A(U,P;U,P) +
\Lambda(U,P;U,P) = 0$. Since $\Lambda(v,q;v,q) = 0$, we are left with $a_1(U_1,U_1) +
a_2(P_2,P_2) + a_3(U_3,U_3) = 0$. This clearly implies that $U_1 = U_3 = 0$, and that
$P_2$ is a global constant over $\Omega_2$.

We now define

$\bar P = (|\Omega_1| + |\Omega_3|)^{-1} \int_{\Omega_1\cup\Omega_3} P$

and consider $\tilde P \in L_0^2(\Omega_1 \cup \Omega_3)$ such that $P = \tilde P + \bar P$. Then the equation (15)
becomes

$\forall v \in X_h$, $\quad B(v,\tilde P) + (P_2 - \bar P)\Big(\int_{\Gamma_{12}} v_1\cdot n_{12} - \int_{\Gamma_{23}} v_3\cdot n_{23}\Big) = 0$. (26)
Since $\tilde P \in L_0^2(\Omega_1 \cup \Omega_3)$, there is a function $z \in (H_0^1(\Omega_1 \cup \Omega_3))^2$ such that
$-\nabla\cdot z = \tilde P$. Denote by $\tilde z$ the approximation of $z$ satisfying (19)-(22), and
choose $v = \tilde z$ in (26). With property (19) and the regularity of $z$, we obtain

$\|\tilde P\|^2_{0,\Omega_1\cup\Omega_3} = B(z,\tilde P) = B(\tilde z,\tilde P) = 0$.

The equation (26), with $\tilde P = 0$, becomes:

$(P_2 - \bar P)\Big(\int_{\Gamma_{12}} v_1\cdot n_{12} - \int_{\Gamma_{23}} v_3\cdot n_{23}\Big) = 0$, $\quad \forall v \in X_h$.

Since $P$ belongs to $L_0^2(\Omega)$, the constants $P_2$ and $\bar P$ are related by
$|\Omega_2|\,P_2 + (|\Omega_1| + |\Omega_3|)\,\bar P = 0$.

These two equations imply that $P_2 = \bar P = 0$, which concludes the proof.

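We now finish this section with some trace and inverse inequalities needed for
the analysis. Let $E$ be a mesh element with diameter $h_E$. Let $k$ be a positive
integer. Then, there exists a constant $C$ independent of $h_E$ such that

$\forall \phi \in \mathbb{P}_k(E)$, $\forall e \subset \partial E$, $\quad \|\phi\|_{0,e} \le C\,h_E^{-1/2}\,\|\phi\|_{0,E}$, (27)

$\forall \phi \in \mathbb{P}_k(E)$, $\forall e \subset \partial E$, $\quad \|\nabla\phi\cdot n_e\|_{0,e} \le C\,h_E^{-1/2}\,\|\nabla\phi\|_{0,E}$. (28)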

3 Error estimates 

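Theorem 1. Let $(u,p)$ be the solution of the coupled problem (1)-(8) such
that $u|_{\Omega_i} \in (H^{k_i+1}(\Omega_i))^2$ for $i = 1, 3$, $p|_{\Omega_i} \in H^{k_i}(\Omega_i)$ for $i = 1, 3$, and
$p|_{\Omega_2} \in H^{k_2+1}(\Omega_2)$. Then the discrete solution $(U,P)$ of (15), (16) satisfies the
following estimate

$\|U_1 - u_1\|_{X^1} + \|P_2 - p_2\|_{M^2} + \|U_3 - u_3\|_{X^3} \le C h^{k_1}\big(|u|_{k_1+1,\Omega_1} + |p|_{k_1,\Omega_1}\big)$
$\qquad + C h^{k_2}\,|p|_{k_2+1,\Omega_2} + C h^{k_3}\big(|u|_{k_3+1,\Omega_3} + |p|_{k_3,\Omega_3}\big)$.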



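Proof. Define $\tilde u$, the approximation of $u$ satisfying the properties (19)-(23), and
$\tilde p$, the approximation of $p$ satisfying (24)-(25). Define $\chi = U - \tilde u$ and $\xi = P - \tilde p$,
and $\chi_i$, $\xi_i$ their restrictions to the subdomains $\Omega_i$. Subtracting (17) and (18)
from (15) and (16), choosing $v_1 = \chi_1$, $q_2 = \xi_2$ and $v_3 = \chi_3$ and using the
coercivity result (14), we obtain

$\kappa\big(\|\chi_1\|^2_{X^1} + \|\xi_2\|^2_{M^2} + \|\chi_3\|^2_{X^3}\big) \le a_1(u_1 - \tilde u_1, \chi_1) + a_2(p_2 - \tilde p_2, \xi_2)$
$\quad + a_3(u_3 - \tilde u_3, \chi_3) + b_1(\chi_1, p_1 - \tilde p_1) - b_3(\chi_3, p_3 - \tilde p_3) - b_1(u_1 - \tilde u_1, \xi_1)$
$\quad + b_3(u_3 - \tilde u_3, \xi_3) + \int_{\Gamma_{12}} (p_2 - \tilde p_2)\,\chi_1\cdot n_{12} - \int_{\Gamma_{12}} \xi_2\,(u_1 - \tilde u_1)\cdot n_{12}$
$\quad + \int_{\Gamma_{23}} \xi_2\,(u_3 - \tilde u_3)\cdot n_{23} - \int_{\Gamma_{23}} (p_2 - \tilde p_2)\,\chi_3\cdot n_{23} + \int_{\Gamma_{13}} p_3\,(\chi_1 - \chi_3)\cdot n_{13}$.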
The pressure term $b_3(\chi_3, p_3 - \tilde p_3)$ vanishes because of (24). The terms $b_1(u_1 -
\tilde u_1, \xi_1)$ and $b_3(u_3 - \tilde u_3, \xi_3)$ vanish because of property (19) of the approximation
$\tilde u$. We now bound the remaining terms. By using the Cauchy-Schwarz inequality, trace
inequalities, and the approximation results (23) and (25), we can bound the
first three terms

ai{ui - ill, Xi) + 02 (P 2 - P2, 6) + 03 (tt 3 - *3, X 3 ) < IIXi llx^ 

+ gllX3llx3 + + 

Because of the projection, the first pressure term is reduced and bounded 
as follows: 

hl{Xl,Pl-Pl) = 51 /{Pl -PlKXl] 

eeriuA 

The remaining terms are the interface terms. Using the approximation result 
(25), the trace inequality (27) and Cauchy- Schwarz’s inequality, we obtain 

[ (P 2 - P 2 )Xi • ni 2 < C\\xi\\o,nih^'‘\p\k2+i,n2 < IWXiWxl + 

J T\2 ^ 

Similarly, 

[ {P2 - P2)xz ■ ri2Z < 1||X3|Im3 + 

-'As ° 

With the bound ( 11 ) and the approximation result (23), we have 

[ 6(«1 -Wl) -ni2 < \U2\\m 2 + n^■ 

The approximation result (23), the inverse inequality (27) and the bound (11) 
give 



/ 6('*^3-t^3) • ^23 

J As 

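Define $\tilde p_3^e$, the projection of $p_3$ onto $X_h^3\cdot n_{13}$ with respect to the $L^2$ inner product on the
edge $e$. Since $\chi$ belongs to $X_h$ and by definition of the projection, we have

$\int_{\Gamma_{13}} p_3\,(\chi_1 - \chi_3)\cdot n_{13} = \sum_{e\in\Gamma_{13}} \int_e (p_3 - \tilde p_3^e)(\chi_1 - \chi_3)\cdot n_{13} = \sum_{e\in\Gamma_{13}} \int_e (p_3 - \tilde p_3^e)\,\chi_1\cdot n_{13}$.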

Assume that each edge $e$ of $\Gamma_{13}$ is shared by the elements $E_e^1 \in \mathcal{E}_h^1$ and $E_e^3 \in \mathcal{E}_h^3$.

Define the constant Cg = |e|“^ X 3 ' '^is- 

Pt)Xl • ni3 = / (^3 - P3)(Xl • ”■13 - Ce) 

T' C 7~' 6 



eeri3 * 



<Ch''^ ^ b3U3,EfllVxillo.Bi<^ll|VxlllU+^^'''1^’3li3,n3- 

eGAa 

The theorem is obtained by combining all bounds. 

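Theorem 2. Under the assumptions of Theorem 1 and the additional assumption
that for any element $E \in \mathcal{E}_h^3$, the edge $\partial E \cap \Gamma_{23}$ belongs to only one element
of $\mathcal{E}_h^2$, the approximation $P$ satisfies the error estimate

$\|P - p\|_{0,\Omega} \le C h^{k_1}\big(|u|_{k_1+1,\Omega_1} + |p|_{k_1,\Omega_1}\big)$
$\qquad + C h^{k_2}\,|p|_{k_2+1,\Omega_2} + C h^{k_3}\big(|u|_{k_3+1,\Omega_3} + |p|_{k_3,\Omega_3}\big)$.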

Proof. Subtracting (17) from (15), the error equation is 

Vw € Xh, = ai(ui - f7i,ui) + as(w3 - t/s.^s) 



+ 



/ {p2-P2)vi-ni2- {P2 - P 2 )v 3 ■ ri 23 + I Pz{v i - V 3) ■ rii3 

d F\2 d As d F\2 



(29) 



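Define $\bar\xi = (|\Omega_1| + |\Omega_3|)^{-1} \int_{\Omega_1\cup\Omega_3} \xi$ and the function $\tilde\xi \in L_0^2(\Omega_1 \cup \Omega_3)$ by
$\tilde\xi = \xi - \bar\xi$. Then

$B(v,\xi) = B(v,\tilde\xi) + B(v,\bar\xi) = B(v,\tilde\xi) - \bar\xi\,\Big(\int_{\Gamma_{12}} v_1\cdot n_{12} - \int_{\Gamma_{23}} v_3\cdot n_{23}\Big)$.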



Let V G (iJo(i7i U i^s))^ such that —V • v = ^ and ^ 

C'll^llo i 7 iui? 3 - Choose in (29) v — the approximation of {), defined by (19)- 
(23): ’ 

lb1lAiUf23 =ai(wi -D'i,fi^) + 03 ( 1 * 3 - 173 ,^ 3 ) 

+ / (P 2 - P 2 )Vi ■ ni 2 - (P 2 - P 2 )V 2 ■ U 23 + Pz{v\ ~ V^) ■ ni 3 . 

d F\2 d F23 d F\2 
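The additional assumption for the meshes at $\Gamma_{23}$ is needed to bound $\int_{\Gamma_{23}} (p_2 -
\tilde p_2)\,\tilde v_3\cdot n_{23}$. Using the fact that $\int_e (v_3 - \tilde v_3)\cdot n_{23} = 0$ for $e = \partial E \cap \Gamma_{23}$ and $E \in \mathcal{E}_h^3$,
and the inequality $\|\tilde v_1\|_{X^1} + \|\tilde v_3\|_{X^3} \le C\,\|\tilde\xi\|_{0,\Omega_1\cup\Omega_3}$, we obtain: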

||e1lo.ri.ur^3 <U||71i-Ui|Ui+C||TX3-173||x3+C|b2-P2||M^+Uh2"lP3|^3,^23. 

We now bound ||^||o, 1220^3 = fd'Uil + \Qz\Y^‘^- Since P G Ll{Q), we have 

|f?l| + |f23| + 

<C\\P2-P2\\xi+Ch>^^\p2\k,+I,a,. 

Combining the results of Theorem 1 with the bounds above gives the optimal
error bound.

As a concluding remark, one can introduce Lagrange multipliers on the interfaces
and obtain an equivalent method. This allows the decoupling of each
subdomain, which may be advantageous for a parallel implementation.
Summary. A nonsymmetric discontinuous Galerkin method with interior penalties
is considered for convection-diffusion problems with parabolic layers. On an
anisotropic mesh with bilinear elements we prove error estimates (uniformly in the
perturbation parameter) in an integral norm associated with this method. On different
types of interelement edges we derive the values of discontinuity-penalization
parameters. Numerical experiments support the theoretical results.



1 Introduction 

Consider the following boundary value problem with homogeneous Dirichlet
boundary conditions

$-\varepsilon\Delta u + b_1 u_x + c\,u = f$ in $\Omega = (0,1)^2$, $\quad u = 0$ on $\Gamma = \partial\Omega$. (1)

Here $0 < \varepsilon \ll 1$ represents a perturbation parameter and $b_1$, $c$ and $f$ are
real-valued functions defined on $\overline\Omega$. Assuming

$b_1(x) \ge \beta_1 > 0$, $\quad c(x) \ge 0$, $\quad c(x) - \dfrac{1}{2}\dfrac{\partial b_1}{\partial x}(x) \ge \gamma > 0$, $\quad x \in \overline\Omega$, (2)

the problem (1) belongs to the class of convection-diffusion problems whose
solutions exhibit, besides an exponential layer, also parabolic layers.

In contrast to convection-diffusion problems with only exponential layers, 
not much is known concerning sharp estimates for derivatives of u. However, 
for the problem (1) with bi{x) = 1 and c{x) = 0, in [10] the following result 
was proved using elliptic decompositions (in the corresponding result from [13] 
more compatibility is required): 

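Theorem 1. If $b_1(x) = 1$, $c(x) = 0$ and $f$, belonging to a Hölder space of sufficiently
high order on $\overline\Omega$ with exponent $\alpha \in (0,1)$,
satisfies the compatibility condition $f(0,0) = f(0,1) = f(1,0) = f(1,1) = 0$,
then the solution $u$ of the boundary value problem (1) can be decomposed as
$u = S + E_1 + E_2 + E_3$, where for all $(x,y) \in \overline\Omega$ and $0 \le i + j \le 3$

$\Big|\dfrac{\partial^{i+j} S}{\partial x^i \partial y^j}(x,y)\Big| \le C$,

$\Big|\dfrac{\partial^{i+j} E_1}{\partial x^i \partial y^j}(x,y)\Big| \le C\,\varepsilon^{-i}\,e^{-(1-x)/\varepsilon}$,

$\Big|\dfrac{\partial^{i+j} E_2}{\partial x^i \partial y^j}(x,y)\Big| \le C\,\varepsilon^{-j/2}\,\big(e^{-\gamma y/\sqrt{\varepsilon}} + e^{-\gamma(1-y)/\sqrt{\varepsilon}}\big)$,

$\Big|\dfrac{\partial^{i+j} E_3}{\partial x^i \partial y^j}(x,y)\Big| \le C\,\varepsilon^{-(i+j/2)}\,e^{-(1-x)/\varepsilon}\,\big(e^{-\gamma y/\sqrt{\varepsilon}} + e^{-\gamma(1-y)/\sqrt{\varepsilon}}\big)$,

with a constant $\gamma > 0$.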

The special structure of the vector field $b = (b_1, 0)$ produces a complete change
in the behaviour of the boundary layers. According to Theorem 1, besides the
regular boundary layer of exponential type at the outflow $x = 1$, there are
also parabolic layers located near the sides $y = 0$ and $y = 1$. In the sequel we
assume that the solution decomposition from the previous theorem exists also
for the more general problem (1)-(2).

The only result known so far for finite element methods (FEMs) on layer-adapted
meshes for the problem (1) is the estimate

$\varepsilon^{1/2}\,|u - u_h|_{H^1(\Omega)} \le C N^{-1} \ln N$,

from [10] for the Galerkin FEM with linear or bilinear elements on a Shishkin
mesh (see also the survey paper [6]). Surprisingly, up to now there is no result
in the literature for the streamline-diffusion finite element method (SDFEM).
Furthermore, for the SDFEM the optimal choice of the stabilization parameter
near parabolic layers is an open problem; see the discussion in [4] and [5]. This was
also a reason for us to investigate an alternative discretization technique.
Here, for the numerical solution of (1)-(2) we use the $h$-version of a nonsymmetric
discontinuous Galerkin finite element method with interior penalties (the
NIPG method), [2], [7], [8], [9]. The technique from [2] is applied, but with
a bilinear interpolant in the error splitting instead of an $L^2$-projection onto a finite
element space. This allows us to use the well-known interpolation error
estimates for the problem (1)-(2). We finally show that on a specially chosen
shape-irregular mesh (Shishkin mesh), this method yields an error bound that
is uniform in the perturbation parameter. Since our discretization involves
the layer-adapted mesh whose construction directly uses information from the
solution decomposition, the technique for proving $\varepsilon$-uniform error estimates
from this paper cannot be applied on more general (nonrectangular) domains
$\Omega$, when such a decomposition is not available.



2 The nonsymmetric discontinuous Galerkin method 

Following [2] and the notation therein, let $\mathcal{T}$ be a general partitioning of the
domain $\Omega = (0,1)^2$ consisting of disjoint open axiparallel rectangles $\kappa$ such
that $\overline\Omega = \bigcup_{\kappa\in\mathcal{T}} \overline\kappa$. In contrast to [2] we allow anisotropic (shape-irregular)
meshes, but assume that there are no hanging nodes.

The broken Sobolev space of composite order $s = \{s_\kappa : \kappa \in \mathcal{T}\}$ is defined
by $H^s(\Omega,\mathcal{T}) = \{v \in L^2(\Omega) : v|_\kappa \in H^{s_\kappa}(\kappa)\ \forall \kappa \in \mathcal{T}\}$. Let us assume that
each $\kappa \in \mathcal{T}$ is an affine image of a fixed reference element $\hat\kappa = (-1,1)^2$, i.e.
$\kappa = F_\kappa(\hat\kappa)$. Then the finite element space is

$S(\Omega,\mathcal{T},F) = \{v \in L^2(\Omega) : v|_\kappa \circ F_\kappa \in Q_1(\hat\kappa)\}$,

where $F = \{F_\kappa : \kappa \in \mathcal{T}\}$ and $Q_1(\hat\kappa)$ is the space of bilinear functions defined
on $\hat\kappa$.

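Let $\mathcal{E}$ be the set of all open one-dimensional element interfaces associated
with $\mathcal{T}$, and $\mathcal{E}_{int} \subset \mathcal{E}$ be the set of all edges $e \in \mathcal{E}$ contained in $\Omega$. Also,
let $\Gamma_{int} = \{x \in \Omega : x \in e \text{ for some } e \in \mathcal{E}_{int}\}$. Then for each $e \in \mathcal{E}_{int}$ we
define the jump and the mean value of a function $v \in H^1(\Omega,\mathcal{T})$ across $e$ by
$[v]_e = v|_{\partial\kappa\cap e} - v|_{\partial\kappa'\cap e}$ and $\langle v\rangle_e = (v|_{\partial\kappa\cap e} + v|_{\partial\kappa'\cap e})/2$, respectively. Here $e$ is
a common edge for elements $\kappa$ and $\kappa'$, and $\partial\kappa$ denotes the union of all open
edges of $\kappa$. With each $e \in \mathcal{E}_{int}$ we associate the unit normal vector $\nu$ pointing
from $\kappa$ to $\kappa'$; if $e \subset \Gamma$, we take $\nu$ to be the unit outward normal vector $\mu$. We also
define the inflow and outflow parts of $\partial\kappa$ by $\partial_-\kappa = \{x \in \partial\kappa : b(x)\cdot\mu_\kappa(x) < 0\}$,
$\partial_+\kappa = \{x \in \partial\kappa : b(x)\cdot\mu_\kappa(x) \ge 0\}$, respectively, where $\mu_\kappa(x)$ represents the
unit outward normal vector to $\partial\kappa$ at the point $x \in \partial\kappa$.

For any element $\kappa \in \mathcal{T}$ and $v \in H^1(\kappa)$ we denote by $v^+$ the interior trace
of $v$ on $\partial\kappa$. In the case $\partial_-\kappa \setminus \Gamma \ne \emptyset$, for some $\kappa \in \mathcal{T}$, for each $x \in \partial_-\kappa \setminus \Gamma$ there
exists a unique $\kappa' \in \mathcal{T}$ such that $x \in \partial_+\kappa'$. Now for a function $v \in H^1(\Omega,\mathcal{T})$
and for some $\kappa \in \mathcal{T}$ with the property $\partial_-\kappa \setminus \Gamma \ne \emptyset$, we define the outer trace
$v^-$ of $v$ on $\partial_-\kappa \setminus \Gamma$ as the inner trace $v^+$ relative to those elements $\kappa'$ such that $\partial_+\kappa' \cap (\partial_-\kappa \setminus \Gamma) \ne \emptyset$. The
jump of $v$ across $\partial_-\kappa \setminus \Gamma$ is defined by $\lfloor v\rfloor_\kappa = v^+ - v^-$. In order to simplify
the notation, in the sequel we omit indices in the terms $[v]_e$, $\langle v\rangle_e$ and $\lfloor v\rfloor_\kappa$.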

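Now, the weak formulation of (1) that corresponds to the NIPG method
reads

find $u_h \in S(\Omega,\mathcal{T},F)$ such that
$B(u_h,v_h) = \sum_{\kappa\in\mathcal{T}} \int_\kappa f\,v_h\,dx$, for all $v_h \in S(\Omega,\mathcal{T},F)$, (3)

where the bilinear form $B$ is given by

$B(v,w) = \sum_{\kappa\in\mathcal{T}} \Big(\varepsilon \int_\kappa \nabla v\cdot\nabla w\,dx + \int_\kappa (b\cdot\nabla v + c\,v)\,w\,dx\Big)$
$\quad + \int_\Gamma \sigma\,v\,w\,ds + \int_{\Gamma_{int}} \sigma\,[v][w]\,ds$
$\quad + \sum_{\kappa\in\mathcal{T}} \Big(-\int_{\partial_-\kappa\cap\Gamma} (b\cdot\mu_\kappa)\,v^+w^+\,ds - \int_{\partial_-\kappa\setminus\Gamma} (b\cdot\mu_\kappa)\,\lfloor v\rfloor\,w^+\,ds\Big)$
$\quad + \varepsilon \int_\Gamma \big(v\,(\nabla w\cdot\mu) - (\nabla v\cdot\mu)\,w\big)\,ds + \varepsilon \int_{\Gamma_{int}} \big([v]\langle\nabla w\cdot\nu\rangle - \langle\nabla v\cdot\nu\rangle[w]\big)\,ds$,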
for $v, w \in H^2(\Omega,\mathcal{T})$. Here $\sigma$ is called the discontinuity-penalization parameter,
and is defined by $\sigma|_e = \sigma_e$, $e \in \mathcal{E}$, where $\sigma_e$ is a nonnegative constant. In the
sequel we shall present the exact choices of $\sigma_e$ for all edges $e \in \mathcal{E}$.

As in [2], assuming that $u \in H^2(\Omega,\mathcal{T})$ and that $u$ and $\nabla u\cdot\nu$ are continuous
across each interior edge $e$, we obtain that $B$ satisfies the Galerkin orthogonality
property. Also, the bilinear form allows one to introduce the so-called
DG-norm $\|v\|_{DG}$, defined for $v \in H^1(\Omega,\mathcal{T})$.

In [3] the authors proved the existence and uniqueness of the solution $u_h$ of the
discrete problem (3). All error estimates in that paper and in [2] are derived
in the DG-norm.

Specifying the result from [ 2 ] to the problem ( 1 ) on a shape-regular mesh 
of maximum element diameter h, we obtain 

11^* - w/iIIdg < C {eh^ + h^ + h‘^) ||w||ff2(n) , (4) 

with the choice $\sigma_e = \varepsilon\,h_e^{-1}$, where $h_e$ represents the length of an edge $e \in \mathcal{E}$.
In general, the estimate (4) is useless when $\varepsilon \to 0$, see Theorem 1. Therefore,
we use an a priori constructed layer-adapted mesh (Shishkin mesh) and show
that on such a partitioning robust convergence is guaranteed. For simplicity,
we use the standard conforming Shishkin mesh and avoid hanging nodes.



3 The discretization mesh and the interpolation error 

Discretization mesh. For the discretization of the boundary value problem (1),
here we use an anisotropic tensor-product Shishkin mesh with $(N+1) \times (N+1)$
mesh nodes, that is adapted to the layers at $x = 1$, $y = 0$ and $y = 1$. Let $N$
be an integer divisible by 4 and let $\lambda_x$ and $\lambda_y$ be mesh transition parameters
defined by

$$\lambda_x = \min\Big\{ \frac{1}{2},\ \frac{2\varepsilon}{\beta_1}\ln N \Big\}, \qquad \lambda_y = \min\Big\{ \frac{1}{4},\ \frac{2\sqrt{\varepsilon}}{\gamma}\ln N \Big\},$$

where $\beta_1$ is the lower bound for the function $b_1$ and $\gamma$ is a constant from
the solution decomposition (Theorem 1). The domain $\Omega$ is split into $\Omega =
\Omega_{11} \cup \Omega_{12} \cup \Omega_{21} \cup \Omega_{22}$, with

$\Omega_{11} = [1-\lambda_x, 1] \times ([0,\lambda_y] \cup [1-\lambda_y, 1])$, $\Omega_{12} = [1-\lambda_x, 1] \times [\lambda_y, 1-\lambda_y]$,

$\Omega_{21} = [0, 1-\lambda_x] \times ([0,\lambda_y] \cup [1-\lambda_y, 1])$, $\Omega_{22} = [0, 1-\lambda_x] \times [\lambda_y, 1-\lambda_y]$.

Then the intervals $[0, 1-\lambda_x]$ and $[1-\lambda_x, 1]$ are each uniformly dissected into $N/2$
subintervals to give the mesh $\Omega_x$ in the $x$-direction, while for $\Omega_y$ we dissect $[0,\lambda_y]$
and $[1-\lambda_y, 1]$ into $N/4$ subintervals each and $[\lambda_y, 1-\lambda_y]$ into $N/2$ subintervals.
Taking the tensor product of $\Omega_x$ and $\Omega_y$, we obtain our final rectangular Shishkin
mesh.
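The mesh construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the default values $\beta_1 = \gamma = 1$ are placeholders, and the cut-offs 1/2 and 1/4 in the transition parameters follow the usual Shishkin convention.

```python
import math

def transition_points(N, eps, beta1, gamma):
    """Transition parameters (lambda_x, lambda_y) of the Shishkin mesh."""
    lam_x = min(0.5, 2.0 * eps / beta1 * math.log(N))
    lam_y = min(0.25, 2.0 * math.sqrt(eps) / gamma * math.log(N))
    return lam_x, lam_y

def uniform(a, b, n):
    """n equal subintervals of [a, b], returned as n + 1 nodes."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def shishkin_mesh(N, eps, beta1=1.0, gamma=1.0):
    """Return the 1D node lists (x, y); the 2D mesh is their tensor product."""
    if N % 4 != 0:
        raise ValueError("N must be divisible by 4")
    lam_x, lam_y = transition_points(N, eps, beta1, gamma)
    # x-direction: coarse on [0, 1 - lam_x], fine on [1 - lam_x, 1]
    x = uniform(0.0, 1.0 - lam_x, N // 2) + uniform(1.0 - lam_x, 1.0, N // 2)[1:]
    # y-direction: fine near y = 0 and y = 1, coarse in between
    y = (uniform(0.0, lam_y, N // 4)
         + uniform(lam_y, 1.0 - lam_y, N // 2)[1:]
         + uniform(1.0 - lam_y, 1.0, N // 4)[1:])
    return x, y

x, y = shishkin_mesh(N=16, eps=1e-6)
print(len(x), len(y))               # nodes per direction
print(x[-1] - x[-2], x[1] - x[0])   # fine and coarse step in x
```

The fine steps printed here correspond to the short edge lengths $2\lambda_x N^{-1}$ and the coarse steps to the long ones, which is the distinction the edge classification relies on.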

On such a constructed partitioning we need to introduce several types of
edges (depending on the type, we later determine the discontinuity-penalization
parameter $\sigma_e$ for each edge $e \in \mathcal{E}$). The edges of type I belong to the set
$(1-\lambda_x, 1] \times ([0,\lambda_y) \cup (1-\lambda_y, 1])$. They are part of the layer region (more
precisely, they lie in the corner layers) and their length is either $h_x = 2\lambda_x N^{-1}$ or
$h_y = 4\lambda_y N^{-1}$. Assuming $\lambda_x = (2\varepsilon/\beta_1)\ln N$, $\lambda_y = (2\sqrt{\varepsilon}/\gamma)\ln N$ and $\varepsilon \le CN^{-1}$,
we have $h_x \ll H_x := 2(1-\lambda_x)N^{-1}$ and $h_y \ll H_y := 2(1-2\lambda_y)N^{-1}$.

Therefore we describe these edges as short; other edges are called long.
Edges of type II are also short and belong to the rest of the layer region
$\Omega_{12} \cup \Omega_{21}$. Since this region also contains long edges, we refer to those as
being of the third type. Type III also contains the long edges near the layers,
from the set $\Omega_{22} \setminus \Omega_{22}^{*}$. Finally, edges of type IV are long and lie in
$\Omega_{22}^{*} = [0, 1-\lambda_x-H_x] \times [\lambda_y+H_y, 1-\lambda_y-H_y]$.

Interpolation error. As was already stated, in the error analysis instead
of an $L^2$-projection onto a finite element space, here we shall use a bilinear
interpolant $u^I$ of $u$ that vanishes on $\Gamma$. In order to obtain an $\varepsilon$-uniform estimate
for $|||u - u_h|||_{DG}$, we shall use various bounds on the interpolation error $\eta = u - u^I$.
For example, in [11] one can find $L^2$- and $L^\infty$-estimates of $\eta$

Mma) < CN-^ , hllL~(?^) < 111' N , < CN~^ , 

while [14] and [15] contain results for $\nabla\eta$. All these results hold under the
assumption $\sqrt{\varepsilon}\ln^2 N \le C$ and they are proved using the decomposition from
Theorem 1 and the technique from [1].

In the following section we present some of the key points of the error analy- 
sis for the NIPG method on the Shishkin mesh when applied to the problem (1). 
More details can be found in [14] and in the forthcoming paper [15]. 



4 Error analysis 

We start the error analysis by first introducing the error decomposition
$u - u_h = \eta + \xi$, $\eta = u - u^I$, $\xi = u^I - u_h$. The final estimate for $|||u - u_h|||_{DG}$ is
further obtained from the triangle inequality.

First notice that from the Galerkin orthogonality property one has
$|||\xi|||_{DG}^2 = -B(\eta,\xi)$. Analyzing each term in $B(\eta,\xi)$ with the technique from [2], we
conclude that $|||\xi|||_{DG}$ can be estimated by terms that depend on the interpolation
error $\eta$, data functions, $\varepsilon$ and the mesh. Collecting these expressions and the
expressions from $|||\eta|||_{DG}$, we conclude that in the final error bound for $|||u - u_h|||_{DG}$
the following terms are to be estimated:
First, it can be proved [14] that

h <CN-HuN, I2<CN~^ . 



The treatment of the terms $I_3, I_4, \ldots, I_7$ depends on the type of edge; more
precisely, we look for the contribution from each edge to these terms. Let
Table 1. Different types of edges and corresponding parameters $\sigma_e$

edge type           parameter $\sigma_e$
I, II horizontal    $\sqrt{\varepsilon}\,N/\ln N$
I, II vertical      $N/\ln N$
III horizontal      $N$ ($e \subset \ldots$), $\sqrt{\varepsilon}\,N/\ln N$ (otherwise)
III vertical        $N$ ($e \subset \Omega_{12}$), $N/\ln N$ (otherwise)
IV horizontal       $\varepsilon N$
IV vertical         $\varepsilon N$


$I_{j,I}$, $I_{j,II}$, $I_{j,III}$ and $I_{j,IV}$ denote the contributions of edges of the type I, II, III
and IV to the terms $I_j$, $j = 3, 4, \ldots, 7$, respectively. Assuming that the values
of the discontinuity-penalization parameters are the same for all edges of the
same type, from the $L^\infty$-interpolation error estimates for $\eta$ and $\nabla\eta$ we have



h,i+h,i <Cs^N~hn^ N , 
l3,u + h,ii < CeiN-iln^N, 
h,ui + l6,iu<CN-iln^ N, 

-? 3 ,iv + -? 6 ,iv < , 



l4,i+h,i+h,i<CsiN-Hn'^N, 

h,n + h,u + l7,u < CsiN-^ In^ N , 

■^4,111 + 4,111 + 4, III < CN~^ ln2 N , 
4, IV + / 5 , 1 V + 4, IV < Cs2 . 



The choices of the discontinuity-penalization parameter $\sigma_e$ are summarized in
Table 1.

We proceed with the term $I_8$ which reduces to



4 = 


1 bi{dxOv da; 


< 


[ h{dxOil da; 


+ 


f hidxOv da; 




K.eT 




J rill ur2i2 




J ri2i uri22 



The first term in $I_8$ can be estimated with



[ hidxOv da; 

J r2i 1 UQt 2 



< 



iiuni2) iuni2) 



< In^ 4|| In^ ATH^Hdg , 

while for the second term we have 



[ hidxOv da; 
J 0.21 UQ 22 



< C'||5x^||i2(Q2iun22)II^IU^r22iun22) 



< < CiV-i||e||DG. 



Thus,
$I_8 \le CN^{-1}|||\xi|||_{DG}$.

At the end, it can be proved, [14], 

Ig < CN-^ , ho < CN-^ , 

In < CN-'^ In^ N, lu < In^ N , 

In < CN-^ In^ N, lu < In^ N . 

Collecting the previously given estimates for $I_1, I_2, \ldots, I_{14}$, we observe that
the long edges from the layer region (type III edges) produce an error of the
highest order. Therefore, the main result for the NIPG method on the anisotropic
Shishkin mesh for the convection-diffusion problem (1) with regular and
parabolic layers reads

Theorem 2. Let $u$ be a solution of the convection-diffusion problem (1) and
let $u_h$ be a solution of the discrete problem (3) on the Shishkin mesh. Assuming
$\sqrt{\varepsilon}\ln^2 N \le C$ and (2), and choosing the penalty parameter as in Table 1, we
have

$$|||u - u_h|||_{DG} \le CN^{-1}\ln^{3/2} N .$$



5 Numerical experiments 

In the sequel we experimentally verify the theoretical result from Theorem 2.
We test the discontinuous Galerkin finite element method (3) on the anisotropic
Shishkin mesh when applied to the problem (1) with $b_1(x) = c(x) = 1$.
The right-hand side $f$ is chosen such that the function

u{x,y) = a; (1 - (l - (l - 

is the exact solution. 

Table 2 presents the maximum values for e — 10~^,...,10“^ of different 
norms of the error $e_h = u - u_h$. The first column corresponds to the DG-norm
$|||e_h|||_{DG}$, while the second corresponds to an estimate of the maximum norm
of the error, denoted by $\|e_h\|_{d,\infty}$. We compute this estimate using an auxiliary
mesh that contains 25 uniformly distributed mesh points per element. We also
compute the values of the $L^2$-norm of the error $\|e_h\|_{L^2(\Omega)}$, as well as the $L^\infty$-norm
of the jumps $\|[u_h]\|_\infty$ along the edges. We observe that the numerical
results for the DG-norm of the error $e_h$ are better than those predicted in
Theorem 2. These results also indicate second-order accuracy of the jumps
along interior edges and 3/2 as the order of convergence of the $L^2$-norm of the
error.

For this test problem, the NIPG method on the anisotropic Shishkin mesh is
inferior in $\|e_h\|_{d,\infty}$ compared to the bilinear Galerkin FEM and bilinear SDFEM
(with the streamline-diffusion parameters as in [12], or in [5]). Nevertheless,







Table 2. The NIPG method on the Shishkin mesh for the test problem

        |||e_h|||_DG        ||e_h||_{d,inf}     ||e_h||_{L2(Omega)}   ||[u_h]||_inf
  N     error      rate     error      rate     error      rate       error      rate
  8     2.200(-1)  0.850    3.465(-1)  0.572    5.099(-3)  0.762      2.113(-1)  1.301
  16    1.220(-1)  1.135    2.332(-1)  0.759    ...        1.016      8.576(-2)  1.725
  32    5.555(-2)  1.346    1.378(-1)  0.954    ...        1.299      2.594(-2)  1.996
  64    2.185(-2)  1.488    7.115(-2)  1.200    6.044(-4)  1.485      6.504(-3)  2.133
  128   7.788(-3)  -        3.097(-2)  -        2.159(-4)  -          1.483(-3)  -



it can be expected that this method, or a more general DGFEM, will exceed 
the streamline-diffusion FEM for the problems with more complicated layer 
structures where the flexibility of the DGFEM with respect to the mesh is 
useful. 
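The "rate" columns of Table 2 are consistent with the usual observed order $\log_2(e_N / e_{2N})$ between two consecutive meshes. A minimal sketch, assuming that this is how the rates were obtained (the function name is hypothetical), recomputes them from the tabulated DG-norm errors:

```python
import math

def observed_rates(errors):
    """Rates log2(e_N / e_2N) for errors given on meshes N, 2N, 4N, ..."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]

# DG-norm errors from Table 2 for N = 8, 16, 32, 64, 128
dg_errors = [2.200e-1, 1.220e-1, 5.555e-2, 2.185e-2, 7.788e-3]

for N, rate in zip([8, 16, 32, 64], observed_rates(dg_errors)):
    print(N, round(rate, 3))
```

Because the tabulated errors are rounded to four digits, the recomputed rates can differ from the printed ones in the last digit.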
Summary. A finite-volume scheme using penta-/hexagonal (PH) grids is presented
for the shallow water model on the sphere. The irregular structure of the PH grid
presents new challenges, e.g. for the calculation of energy gradients. Radial basis
functions (RBFs) are employed for the accurate and efficient approximation of these
values. The resulting algorithm is shown to be mass- and vorticity-conserving, and
initial numerical results are presented.



1 Introduction 

We present a mass- and vorticity-conserving finite-volume algorithm to solve
the shallow-water equations on the sphere (SWES), where the sphere is
spanned by a grid of pentagonal and hexagonal cells (hereafter referred to
as a PH grid).

The vector-invariant formulation of the SWES is

$$\frac{\partial h}{\partial t} + \nabla\cdot(\mathbf{v}h) = 0 \tag{1}$$

$$\frac{\partial \mathbf{v}}{\partial t} + (\zeta + f)\,\mathbf{k}\times\mathbf{v} + \nabla(\kappa + \Phi) = 0, \tag{2}$$

where $\mathbf{v}$ is the velocity field, $\zeta = \mathbf{k}\cdot(\nabla\times\mathbf{v})$ is the relative vorticity, $f = 2\omega\sin\theta$
the Coriolis parameter ($\omega$ is the angular velocity of the earth, $\theta$ the latitude),
$\mathbf{k}$ the vertical unit vector, $h$ the depth of the fluid (directly proportional to
the fluid mass in the cell), $\kappa$ the kinetic energy, and $\Phi = gh$ the potential.
$\eta = \zeta + f$ is also referred to as the absolute vorticity.

This paper is an extension of work by Lin and Rood [10] who proposed 
a mass- and vorticity-conserving algorithm for a standard orthogonal latitude- 
longitude (lat-lon) grid. While [10] has proved successful in atmospheric mod- 
eling, it makes assumptions about the orthogonality of the underlying grid. 
Such grids have a singularity at the poles, which must be dealt with specially. 
On PH grids, which have no pole problem, the algorithm is not immediately 
applicable. 

A short overview of PH grids is given in Section 2. In Section 3 it is then
shown that the inherent limitation in [10] to orthogonal grids lies only in its
advection algorithm. Assuming the availability of an appropriate advection
algorithm for PH grids, a related formulation exists for general non-orthogonal
grids. One problem with such grids, however, is the additional difficulty of
calculating the gradients in equation (2). A technique for calculating these to
higher order is described. First numerical results of tests proposed in [14] are
presented in Section 4.

Although only the two-dimensional SWES are discussed here, the ultimate 
goal of our research is to create a three-dimensional solver for atmospheric 
dynamics. A technique for extending the 2D SWES to the 3D problem is 
treated in [8] and will not be treated further here. 



2 Overview of Penta-/Hexagonal Grids 

The use of penta-/hexagonal grids (which cover the sphere with twelve spherical
pentagons and an arbitrary number of hexagons) for atmospheric modeling
is not new. A summary of finite-difference methods on such grids was given by 
Williamson in [13]. A milestone in the use of PH grids was achieved by Heikes 
and Randall [4] who built much of the foundation for a genuine General Cir- 
culation Model (GCM). Baumgartner, Majewski, et al. [6] built a production 
model based on these grids which now provides numerical weather forecasts 
at the German Weather Service (DWD). 




Fig. 1. The standard lat-lon grid (left) illustrates the Mercator projection of surface
coordinates to the rectangular domain $[-\pi/2, \pi/2] \times [-\pi, \pi]$. The PH grid (right)
instead decomposes the sphere into 12 spherical pentagonal cells and the rest into
hexagonal cells.



A PH grid can be obtained by first distributing n points over the globe 
in a relatively even manner. A common method, used in [4, 6] among others, 
is to start with an icosahedron (consisting of 20 equilateral triangles). Each 
triangle is then subdivided recursively into 4 equilateral triangles until the 
desired resolution is obtained. This procedure gives rise to an icosahedral grid.
Heikes et al. [4] use roughly the same technique although the grid is twisted 
to maintain a northern/southern hemisphere symmetry. From this grid a PH 
grid can be constructed, e.g. by considering the perpendicular bisectors of 
each triangle’s edge. In this case, the cells form the Voronoi complement grid 
of the icosahedral grid. The Voronoi cell Ck is the set of points on the sphere 
equidistant or closer to triangle vertex k than any other. 
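The recursive subdivision of the icosahedron can be sketched as follows. This is an illustrative, untwisted construction (not the specific variant of [4] or [6]); all function names are hypothetical. Counting the triangles that meet at each vertex shows which Voronoi cells of the dual grid are pentagons and which are hexagons.

```python
import math
from collections import Counter

def normalize(v):
    """Project a point radially onto the unit sphere."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def icosahedron():
    """Vertices and triangular faces of an icosahedron inscribed in the unit sphere."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    raw = [(-1, phi, 0), (1, phi, 0), (-1, -phi, 0), (1, -phi, 0),
           (0, -1, phi), (0, 1, phi), (0, -1, -phi), (0, 1, -phi),
           (phi, 0, -1), (phi, 0, 1), (-phi, 0, -1), (-phi, 0, 1)]
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    return [normalize(v) for v in raw], faces

def subdivide(verts, faces):
    """Split every triangle into 4 and push the new vertices onto the sphere."""
    verts = list(verts)
    cache = {}

    def midpoint(a, b):
        key = (min(a, b), max(a, b))      # each edge is shared by two triangles
        if key not in cache:
            m = tuple((p + q) / 2.0 for p, q in zip(verts[a], verts[b]))
            verts.append(normalize(m))
            cache[key] = len(verts) - 1
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
    return verts, new_faces

def cell_types(faces):
    """Number of dual cells by corner count (= triangles meeting at a vertex)."""
    valence = Counter(i for f in faces for i in f)
    return Counter(valence.values())

verts, faces = icosahedron()
for _ in range(2):
    verts, faces = subdivide(verts, faces)
print(len(verts), len(faces), dict(cell_types(faces)))
```

After $k$ subdivision levels this construction has $10\cdot 4^k + 2$ triangle vertices, i.e. that many PH cells, of which exactly 12 are pentagons.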

Icosahedral grids are not the only basis for the class of PH grids. Given 
a non-degenerate set of points on the sphere, the spherical convex hull can be 
constructed. The facets of this hull are triangles. Care must be taken with the 
choice of the triangle vertices so that the intersection (subsequently referred 
to as the cell vertex) of the three perpendicular bisectors will be inside the 
triangle. Around any given triangle vertex the cell vertices will form either 
pentagons or hexagons (not necessarily regular, but by their construction con- 
vex). Finally, the surface is “inflated” to form a sphere (see Figure 1). Several 
issues then arise due to the spherical geometry which are described in [11] but 
will be passed over here for sake of brevity. 

A non-degenerate distribution of four or more points yields a convex hull 
with triangular faces. With twelve or more points, the Voronoi dual diagram 
will be a PH grid. The points can be evenly distributed by minimizing a poten- 
tial function, in which each point on the sphere can be considered as a point 
charge. An exhaustive study of such distributions has been made by Sloane, 
et al. [3]. Finally, the cell center can be defined as the cell’s barycenter; there
are numerical advantages for this choice. The barycenter does not necessarily 
coincide with the corresponding triangle vertex. 

Heikes and Randall [5] point out that even if the perpendicular bisector 
of the triangle edge is used, the midpoint of the chord joining the cell centers 
will not generally coincide with the midpoint of the chord connecting the cell 
vertices (cell edge midpoint or flux point). This fact leads to problems if simple
finite differences are used to calculate, for example, the gradient in the flux
point, and can have a negative influence on the order of the algorithm. In 
[5] a revised method of placement for the triangle vertices is suggested which 
minimizes a (somewhat artificial) norm describing midpoint alignment. In [12], 
a more founded approach is taken: the vertices are conceptually connected by 
springs and allowed to adjust their positions by spring dynamics. The resulting 
configuration still has the advantages of the icosahedral grid, while improving 
the algorithm’s order. 

Several authors, e.g. [6, 13], have pointed out the advantages of PH grids 
for atmospheric modeling. In the first place, they avoid the “pole” problem 
of lat-lon grids, namely the convergence of the meridians at the poles (small 
grid cells can violate a CFL condition). Moreover, PH grid cells only have 
edge neighbors. That is, there is no pair of cells which share only a cell vertex. 
This allows a straightforward application of finite volumes, as will be seen 
subsequently. 
3 Shallow Water Model on the Sphere with PH Grids

By taking the curl and divergence of equation (2), respectively, we arrive at
the vorticity/divergence form of the SWES:

$$\frac{\partial \eta}{\partial t} + \nabla\cdot(\mathbf{v}\eta) = 0 \tag{3}$$

and

$$\frac{\partial \delta}{\partial t} + \nabla\cdot[\eta\,\mathbf{k}\times\mathbf{v}] + \Delta(\kappa + \Phi) = 0. \tag{4}$$
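The step from (2) to the vorticity equation (3) can be made explicit. The following worked lines are a sketch in local planar notation, assuming a time-independent Coriolis parameter $f$ and writing $\eta = \zeta + f$; they apply $\mathbf{k}\cdot\nabla\times$ to each term of (2):

```latex
\begin{align*}
\mathbf{k}\cdot\nabla\times\frac{\partial\mathbf{v}}{\partial t}
  &= \frac{\partial\zeta}{\partial t} = \frac{\partial\eta}{\partial t}
  && (\partial f/\partial t = 0),\\
\mathbf{k}\cdot\nabla\times\left(\eta\,\mathbf{k}\times\mathbf{v}\right)
  &= \nabla\cdot(\eta\,\mathbf{v}),\\
\mathbf{k}\cdot\nabla\times\nabla(\kappa+\Phi) &= 0 .
\end{align*}
```

Summing the three lines gives (3). Taking the divergence of (2) instead turns the gradient term into the Laplacian of the total energy, which is the last term of (4) once the divergence of the velocity is denoted by $\delta$.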

An expression for the velocity can be found by solving elliptic equations to
determine the stream function and velocity potential

Vorticity: $\eta = \Delta\psi$ with stream function $\psi$

Divergence: $\delta = \Delta\chi$ with velocity potential $\chi$

Velocity: $\mathbf{v} = \mathbf{k}\times\nabla\psi + \nabla\chi$.

This approach is taken in [6], for example, at considerable computational
expense, and it seems worthwhile to look for a more computationally expedient
method.

For lat-lon grids, an explicit mass- and vorticity-conserving shallow-water
model was proposed in [10]. To derive this algorithm, equation (2) was formulated
in lat-lon coordinates, thus inherently limiting it to orthogonal grids. The
elegance of the method lies in the numerical treatment of the vector-invariant
formulation under consideration of the vorticity equation (3), which implies
that the vorticity $\eta$ is merely advected. Thus a local constraint is imposed: local
changes in the vorticity can only affect the region within the propagation zone
of the advection.

The explicit time stepping method for the grid cell $\Omega_{ij}$ at the $(i,j)$th location,
presented in [10], is:



Ln+l _ Ln 



+ F{u% At] h^) -h G{y\At] h^) 

^n+1 = 






I 






K * +^"+1 
tO 



■e 



K * + ^^+1 



(5) 

( 6 ) 

(7) 



where $F$ and $G$ are flux-form operators in longitude and latitude, specifically
the difference of incoming and outgoing fluxes $F = \mathcal{F}_{i+1/2} - \mathcal{F}_{i-1/2}$ and
$G = \mathcal{G}_{j+1/2} - \mathcal{G}_{j-1/2}$; $\delta_\lambda(\cdot)$ and $\delta_\theta(\cdot)$ are grid differences, while $\overline{(\cdot)}^{\lambda}$ and $\overline{(\cdot)}^{\theta}$ are
grid means, defined in the edge midpoints in $\lambda$ and $\theta$ respectively.
The source terms $X$ and $Y$ in equations (6) and (7) are the time-averaged
fluxes for the finite-volume discretization of equation (2). Ignoring higher order
terms, these can be approximated for some scalar quantity $q^n$ at time step $n$
as:

$$X(u^*, \Delta t; q^n) \approx \frac{1}{\Delta t}\int_t^{t+\Delta t} u^* q \,dt, \qquad Y(v^*, \Delta t; q^n) \approx \frac{1}{\Delta t}\int_t^{t+\Delta t} v^* q \,dt. \tag{8}$$

The discrete, cell-averaged relative vorticity is defined as:

$$\zeta^n = \frac{1}{A\Delta\lambda\cos\theta}\,\delta_\lambda\left[v^n\right] - \frac{1}{A\Delta\theta\cos\theta}\,\delta_\theta\left[u^n\cos\theta\right]. \tag{9}$$



Crucial to the algorithm is the use of staggered grids (C- and D- grids, 
described in [7]), on which the velocities are defined in the same point as the 
flux. Thus there is a one-to-one dependency between fluxes and velocities. If the 
fluxes depended on more than one velocity value, it would be necessary to solve 
for velocity from a system of equations. But in this case it is straightforward 
to calculate predictor values, u* and v* (on a C-grid) which are then inserted 
into the corrector step, on a D-grid, in (5), (6) and (7). As this technique is 
discussed at length in [10], we will pass over this topic. 

There is some additional complexity at the pole, where a pole cap cell (a regular
polygon with $m = 2\pi/\Delta\lambda$ vertices) needs to be considered separately. The
poles are also treated in a way which conserves mass and vorticity.

After some analysis of the algorithm in [10], several key features present 
themselves. The global conservation of discrete mass follows directly from equa- 
tion (1). With some calculation (see [11]), it can be shown that the global dis- 
crete vorticity is conserved as well. Furthermore, orthogonality is inherently as- 
sumed in the algorithm, particularly in its use of a flux-form Semi-Lagrangian 
(FFSL) advection scheme presented in [9]. FFSL is a finite-volume scheme
which splits the time-averaged fluxes of the generic scalar quantity $q$ into an
east-west part $X$ and a north-south part $Y$. This splitting can only realistically be applied
on an orthogonal grid. 

Our goal is to obtain a similar algorithm to solve the SWES which is not 
limited to orthogonal grids, nor even to grids which are orthogonal to their dual 
grid (i.e. possess the Voronoi-Delaunay property). To this end it is necessary 
to introduce nomenclature which is sufficiently general for PH grids. 
$\Omega_i$                              cell with index $i$
$f_i$                                   Coriolis parameter in $\Omega_i$
$\mathcal{N}_i$                         index set of neighboring cells to $\Omega_i$
$h_i^n$                                 fluid depth (“mass”) in cell $\Omega_i$ at time step $n$
$\eta_i^n$                              discrete absolute vorticity in cell $\Omega_i$ at time step $n$
$l_{i,k}$                               length of edge $\{i,k\}$
$n_{i,k}$                               outward vector perpendicular to edge $\{i,k\}$
$t_{i,k}$                               counter-clockwise unit vector along edge $\{i,k\}$
$w^{\perp}_{i,k}$                       “C-grid” wind speed $\perp$ to edge $\{i,k\}$
$w^{\parallel}_{i,k}$                   “D-grid” wind speed $\parallel$ to edge $\{i,k\}$
$\mathcal{F}_{i,k}(q, w, \Delta t)$     flux of quantity $q$ with wind $w$ for $\Delta t$ through edge $\{i,k\}$
$\mathcal{G}_{i,k}(q)$                  discrete approx. to gradient of $q$ at flux-point $\{i,k\}$



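Using this notation it is possible to formulate expressions for the discrete
vorticity and divergence, which are simply generalizations of the formulas in [10]:

$$\eta_i = \frac{1}{|\Omega_i|}\sum_{k\in\mathcal{N}_i} w^{\parallel}_{i,k}\, l_{i,k} + f_i, \qquad \delta_i = \frac{1}{|\Omega_i|}\sum_{k\in\mathcal{N}_i} w^{\perp}_{i,k}\, l_{i,k}. \tag{10}$$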



These are second order if the cell center is the barycenter. 
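The discrete vorticity and divergence can be illustrated on a single planar hexagonal cell. The sketch below is a simplification made for illustration only (a flat cell instead of a spherical one, winds sampled at the edge midpoints, hypothetical function names): it sums the edge-parallel and edge-normal wind components times the edge lengths and divides by the cell area.

```python
import math

def hexagon(radius=1.0):
    """Corners of a regular hexagon, listed counter-clockwise."""
    return [(radius * math.cos(math.pi / 3 * k), radius * math.sin(math.pi / 3 * k))
            for k in range(6)]

def polygon_area(corners):
    """Shoelace formula for a counter-clockwise polygon."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:] + corners[:1]):
        s += x0 * y1 - x1 * y0
    return 0.5 * s

def discrete_vorticity_divergence(corners, wind, f=0.0):
    """Circulation and outflow over the cell edges, divided by the cell area."""
    circ = 0.0
    flux = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:] + corners[:1]):
        length = math.hypot(x1 - x0, y1 - y0)
        t = ((x1 - x0) / length, (y1 - y0) / length)    # counter-clockwise tangent
        n = (t[1], -t[0])                               # outward normal
        u, v = wind(0.5 * (x0 + x1), 0.5 * (y0 + y1))   # wind at the flux point
        circ += (u * t[0] + v * t[1]) * length          # edge-parallel ("D-grid") part
        flux += (u * n[0] + v * n[1]) * length          # edge-normal ("C-grid") part
    area = polygon_area(corners)
    return circ / area + f, flux / area

cell = hexagon()
print(discrete_vorticity_divergence(cell, lambda x, y: (-y, x)))  # solid-body rotation
print(discrete_vorticity_divergence(cell, lambda x, y: (x, y)))   # radial outflow
```

For winds that vary linearly in space the midpoint sampling is exact, so the rotation field yields vorticity 2 with zero divergence and the radial field the reverse, up to rounding.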

At this point the one-to-one correspondence with [10] is lost. In [10], the C-
and D-grid winds (corresponding to $w^{\perp}_{i,k}$ and $w^{\parallel}_{i,k}$) are interchanged by two
averagings. This only works if the grid is orthogonal, thus for non-orthogonal
grids another approach must be found. In [1] a linear reconstruction (a Raviart-Thomas
element of order 0) for the wind inside the cell is made, thus retaining
only the edge-parallel winds as unknowns. However there is a limitation: the
coefficients of the linearization can only be determined in the case of a triangular
grid. Unfortunately, for PH grids the linear system is overdetermined, and
a higher order approximation would have to be employed.

We propose instead to work with both $w^{\perp}_{i,k}$ and $w^{\parallel}_{i,k}$, treating each
equally. In order to do this, we need the help of equation (4) to determine
$w^{\perp}_{i,k}$. The predictor-corrector character of the algorithm in [10] is retained to
provide higher order in time. This gives rise to the following half-time updates:



= K- At) (11) 



Inserting the above updates into the definitions for discrete vorticity, we find:
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"" 1^ 5! ~ ^t) + G\U,, («;+^)}) h,k + fi 

Y1 - Y1 G\U^,{K + ^)li,k\ 



\a 

= v?- 



keNi 

At 

\^i\ 



I keNi 



keNi 






At 



I a 






keNi 



Y.vr^\Qi\ = Y.v7m 

Vi Vi 

Y (•^fc(’7i> + G\\i.k + ^))kk ■ 

Vi keNi 

0 

Similarly, we arrive at the anticipated form for the updated divergence: 



- loj E -^11“^.^- ^^)kk - E (« + ^)kk- 

' keNi I keNi 

A clear hurdle in the above PH grid updates is the calculation of the
gradients $\mathcal{G}$. Not only should these be as accurate as possible, they should also
satisfy the constraint that the discrete curl of $\mathcal{G}$, namely $CG_i$, is identically
0. Given generic scalar cell mean quantities $q_i$, it is possible to find a higher
order expression for the edge gradients [11] at the flux point. For example, one
can define a spherical radial basis function (SRBF, see [2]) interpolant from a local
set of $q_j$ clustered around the flux point:

$$\tilde{q}(x) = \sum_{j=1}^{N} \lambda_j T(r_j) + \lambda_{N+1} + (\lambda_{N+2}, \ldots, \lambda_{N+d+1})^T x \quad \text{with } r_j = \|x - x_j\|, \tag{14}$$

where $T(r)$ is a radial basis function, e.g. $T(r) = r^2\log r/16\pi$ for thin-plate
splines, and the $\lambda$ are found such that $\tilde{q}(x_j) = q_j$. If one considers the subset
$q_j$ from cells centered on (and including) cell $\Omega_i$ and uses the resulting SRBF
to approximate all the edge-parallel gradients of the central cell, the discrete
curl-gradient $CG_i$ disappears:



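$$CG_i(q) = \sum_{k\in\mathcal{N}_i} (\nabla\tilde{q}\cdot t)_{i,k}\, l_{i,k}
= \sum_{j=1}^{N} \lambda_j \underbrace{\sum_{k\in\mathcal{N}_i} \nabla T(r_j)\cdot t_{i,k}\, l_{i,k}}_{0} + (\lambda_{N+2}, \lambda_{N+3}) \cdot \underbrace{\sum_{k\in\mathcal{N}_i} t_{i,k}\, l_{i,k}}_{0} = 0.$$
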




A Finite- Volume Mass- and Vorticity-Conserving Shallow- Water Model 753 

But the use of this gradient approximation will not produce consistent gradients
between neighboring cells, which is a requirement. On the other hand,
it is possible to use different SRBFs $\tilde{q}^{(k)}(x)$, each built from a subset
clustered around a flux point $\{i,k\}$, to approximate the gradients. In this
case there is only one consistent gradient value per flux point. However, $CG_i$
does not necessarily disappear.



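$$CG_i(q) = \sum_{k\in\mathcal{N}_i} (\nabla\tilde{q}^{(k)}\cdot t)_{i,k}\, l_{i,k}
= \sum_{k\in\mathcal{N}_i} \Big( \sum_{j=1}^{N} \lambda_j^{(k)} \nabla T(r_j) + (\lambda_{N+2}^{(k)}, \lambda_{N+3}^{(k)}) \Big) \cdot t_{i,k}\, l_{i,k} \ne 0 \text{ in general.}$$
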



SRBFs offer a flexible interpolation method with scattered data and do
not require specific grid qualities, such as the orthogonality of the grid with its
dual (the Voronoi-Delaunay property). If a higher order approximation, e.g.
with SRBFs, for $q$ at each cell vertex is used, then a second order gradient
approximation can be found which fulfills $CG_i = 0$ and is consistent between
cells.
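A radial basis function interpolant with a thin-plate spline kernel and a linear polynomial part can be sketched as below. This is a planar stand-in for the spherical RBFs discussed here, written under stated assumptions: the constant factor of the kernel is dropped, the usual three side conditions on the $\lambda_j$ are imposed, and all function names are hypothetical. The analytic gradient of the interpolant is what would be sampled at a flux point.

```python
import math

def tps(r):
    """Thin-plate spline kernel r^2 log r (constant factor omitted)."""
    return 0.0 if r == 0.0 else r * r * math.log(r)

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit(points, values):
    """Coefficients (lambda_1..lambda_N, constant, linear part) of the interpolant."""
    n = len(points)
    A = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            A[i][j] = tps(math.hypot(xi - xj, yi - yj))
        for k, p in enumerate((1.0, xi, yi)):
            A[i][n + k] = p      # polynomial part of the interpolant
            A[n + k][i] = p      # side conditions on the lambda_j
    return solve(A, list(values) + [0.0, 0.0, 0.0])

def evaluate(points, coef, x, y):
    n = len(points)
    val = coef[n] + coef[n + 1] * x + coef[n + 2] * y
    for (xj, yj), lam in zip(points, coef):
        val += lam * tps(math.hypot(x - xj, y - yj))
    return val

def gradient(points, coef, x, y):
    n = len(points)
    gx, gy = coef[n + 1], coef[n + 2]
    for (xj, yj), lam in zip(points, coef):
        r = math.hypot(x - xj, y - yj)
        if r > 0.0:
            w = lam * (2.0 * math.log(r) + 1.0)   # kernel derivative divided by r
            gx += w * (x - xj)
            gy += w * (y - yj)
    return gx, gy

# cell centers: one central cell and its six neighbors (planar toy configuration)
centers = [(0.0, 0.0)] + [(math.cos(math.pi / 3 * k), math.sin(math.pi / 3 * k))
                          for k in range(6)]
q = [1.0 + 2.0 * x - 3.0 * y for x, y in centers]
coef = fit(centers, q)
print(gradient(centers, coef, 0.3, 0.1))
```

Because the polynomial part reproduces linear data, the printed gradient should agree with the exact gradient of the sample field up to rounding; for general data the accuracy depends on the point distribution.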



= 

CGi{q) = 

where $p(i,k)$ and $s(i,k)$ are the indices of the preceding and succeeding neighbor
cells of $i$ and $k$ around $i$ in counter-clockwise fashion. The final formulation of
the predictor is






Some approximation of vertex value shared by cells i, j and ^5) 

(16) 

(17) 



h 



i,k 



^ ^i,k,s{i,k) ^i,k,p{i,k) — ^5 

keNi 



f n+1/2 



h? 



^ At 

7 2^ 



2\Q, 



keNi 



TG = - T {nkirii,v^i^i,k, ^) + a,i,,(« +^)} 

- Y y) + (« + ^) } 



n+1/2 

^ I 
-Li.fc 



with the time- averaged kinetic and potential energies, 
K* = • v"+l/2) 



(18) 

(19) 

( 20 ) 

( 21 ) 



The corrector step is 
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Kt' = ‘a - if; E 

' keNi 






( 22 ) 



<tt = {^k(Vi, At) + («* + ^*)} . (24) 



Thus for PH grids, the SWES problem, as in [10], can be completely determined
from gradient approximations and advective fluxes across cell
boundaries. Fluxes between vertex neighbors are avoided since these do not
exist on PH grids. In this paper the flux determination is considered a “black
box”: the advection problem on PH grids is far from simple, and finding a second
order monotone method for solving it is a matter of our current research.



4 Numerical Tests and Discussion 

A prototype of the suggested algorithm was programmed in Matlab; later it will 
be programmed in Fortran 90. We employ the tests suggested in [14], in partic- 
ular the Cosine Bell and Rossby-Haurwitz tests. For the initial tests, a simple 
first-order, hence rather diffusive, advection algorithm was implemented. The 
gradient approximation is performed using SRBF approximations and vertex 
differences from equation (16). The results for the cosine bell rotation are 
illustrated in Figures 2 and 3. 
Fig. 2. The Cosine-Bell test from [14] is applied
to a grid containing 4482 cells. After
one full rotation in approx. 600 hrs., the peak
has diffused considerably, but retains its basic
form.
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Ratio 
of peak 


Ratio 
of min 


0 
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60 
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0.96 
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0.90 
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0.88 
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0.85 
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0.92 
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0.90 
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0.44 


0.89 
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0.41 


0.95 


540 


0.39 


0.96 


600 


0.38 


0.99 



Fig. 3. For the test in Fig. 2, 
the table contains the ratios, as 
a function of time, of the max- 
ima (i.e. cone peak) and min- 
ima to the starting values 
Summary. In this contribution, we discuss parallelization of the problem of curve
dynamics in the plane. The related PDEs are based on the levelset method introduced in
[5], and on the phase-field method described in [1]. The numerical schemes use a finite-difference
discretization in space and explicit time solvers. The parallel algorithms are
designed for systems with distributed memory, and are based on domain splitting.
The achieved results indicate the strength and efficiency of the described approach in the case
of such highly nonlinear problems.



1 Mean-curvature flow 

We study the following motion law for closed planar curves denoted as $\Gamma$:

$$v_\Gamma = -g(\theta)\kappa_\Gamma + F, \tag{1}$$

in the direction of the Euclidean normal vector to $\Gamma$. Here, $n_\Gamma$ denotes the
normal vector to $\Gamma$, $v_\Gamma$ the normal velocity, $\kappa_\Gamma$ the mean curvature, $F$ a forcing
term, $g$ is a suitable positive $2\pi$-periodic function of curve anisotropy, and $\theta$ is
the angle between $n_\Gamma$ and a prescribed direction. We take $g(\theta) = \psi(\theta) + \psi''(\theta)$,
where $\psi(\theta) = 1 + C\cos(N_{fold}\theta)$, $C$ is the anisotropy strength and $N_{fold}$ a type
of symmetry. The equation (1) in the form of the Gibbs-Thomson law is
contained in the modified Stefan problem. For details, we refer the reader to
[1]. In [4], we may find an application in noise filtering, edge detection and
morphing of computer-processed image data.
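For the cosine form of $\psi$ used above, differentiating twice gives $g(\theta) = 1 + C(1 - N_{fold}^2)\cos(N_{fold}\theta)$, so $g$ stays positive for all angles exactly when $|C|(N_{fold}^2 - 1) < 1$. A short sketch (hypothetical function names, assuming $\psi$ has no phase shift) makes this check concrete:

```python
import math

def psi(theta, C, n_fold):
    """Anisotropy function psi(theta) = 1 + C cos(n_fold * theta)."""
    return 1.0 + C * math.cos(n_fold * theta)

def g(theta, C, n_fold):
    """g = psi + psi'' = 1 + C (1 - n_fold^2) cos(n_fold * theta)."""
    return 1.0 + C * (1.0 - n_fold ** 2) * math.cos(n_fold * theta)

def g_is_positive(C, n_fold):
    """True if g(theta) > 0 for every theta (n_fold >= 1 assumed)."""
    return abs(C) * (n_fold ** 2 - 1) < 1.0

print(g_is_positive(0.05, 4), g_is_positive(0.1, 4))
```

With four-fold symmetry the admissible anisotropy strength is therefore below 1/15; stronger anisotropy would make $g$ change sign and the motion law (1) lose its parabolic character.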

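Hamilton-Jacobi equation. Assume that the curve $\Gamma(t)$ is represented by
a levelset of a function $P = P(t,x)$, i.e., $\Gamma(t) = \{x \in \mathbb{R}^2 \mid P(t,x) = \text{const.}\}$.
We can express the quantities appearing in (1) by means of $P$:

$$n_\Gamma = -\frac{\nabla P}{|\nabla P|}, \qquad v_\Gamma = \frac{\partial_t P}{|\nabla P|}, \qquad \kappa_\Gamma = \operatorname{div}(n_\Gamma).$$

Then, we can introduce the Hamilton-Jacobi equation (see [5, 3])

$$\frac{\partial P}{\partial t} = g(\theta)\,|\nabla P|\,\nabla\cdot\Big(\frac{\nabla P}{|\nabla P|}\Big) + |\nabla P|\,F. \tag{2}$$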
Allen-Cahn equation. Extensive experience with non-linear reaction-diffusion
equations led to the development of a phase-field approximation of
(1) by the Allen-Cahn equation [2], or by a modified Allen-Cahn equation [1].
The evolution of the levelset $\frac{1}{2}$ of its solution approximates the evolution of
the manifold $\Gamma(t)$, as discussed in [1].

First, we denote a rectangular domain $\Omega = (0,L_1) \times (0,L_2) \subset \mathbb{R}^2$, $[x,y] \in \Omega$, and
the time variable $t \in (0,T)$. The problem for an unknown function $p =
p(t,x,y)$ reads as follows

^/o(p)) +F(«)$|Vp|, in (0,T) X Q, 

'P\dQ = 0 on (0,T) X dQ, = Pini(x) in 

Here, $\xi > 0$ is a parameter related to the thickness of the interface layer (it is
usually set to a value $\ll 1$). The polynomial $f_0(p) = ap(1-p)(p - \frac{1}{2})$ with
$a > 0$ is derived from the double-well potential $w_0$ as $w_0' = -f_0$. The function
$F = F(x,y)$ is bounded. The function $p_{ini}$ is an initial condition. We refer the
reader to [1] for details concerning the equation and its physical background.



Numerical schemes. We treat the PDE problems (2) and (3), both closely
related to (1), by several numerical schemes implemented by means of
parallelization tools for systems with distributed memory. The problems are
solved in a spatial domain $\Omega = (0,L_1) \times (0,L_2)$, which is discretized by a
rectangular uniform grid with mesh sizes $h_1$, $h_2$ in directions $x$ and $y$.

We introduce the following notations for a given function $u$:



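$$h_1 = \frac{L_1}{N_1}, \quad h_2 = \frac{L_2}{N_2}, \quad u_{ij} = u(ih_1, jh_2),$$

$$\omega_h = \{[ih_1, jh_2] \mid i = 1,\ldots,N_1-1;\ j = 1,\ldots,N_2-1\},$$

$$\bar\omega_h = \{[ih_1, jh_2] \mid i = 0,\ldots,N_1;\ j = 0,\ldots,N_2\}, \quad \gamma_h = \bar\omega_h - \omega_h,$$

$$u_{\bar{x},ij} = \frac{u_{ij} - u_{i-1,j}}{h_1}, \quad u_{x,ij} = \frac{u_{i+1,j} - u_{ij}}{h_1}, \quad u_{\mathring{x},ij} = \frac{u_{i+1,j} - u_{i-1,j}}{2h_1},$$

$$u_{\bar{y},ij} = \frac{u_{ij} - u_{i,j-1}}{h_2}, \quad u_{y,ij} = \frac{u_{i,j+1} - u_{ij}}{h_2}, \quad u_{\mathring{y},ij} = \frac{u_{i,j+1} - u_{i,j-1}}{2h_2},$$

$$u_{\bar{x}x,ij} = \frac{1}{h_1^2}(u_{i+1,j} - 2u_{ij} + u_{i-1,j}), \quad u_{\bar{y}y,ij} = \frac{1}{h_2^2}(u_{i,j+1} - 2u_{ij} + u_{i,j-1}),$$

$$\bar\nabla_h u = [u_{\bar{x}}, u_{\bar{y}}], \quad \mathring\nabla_h u = [u_{\mathring{x}}, u_{\mathring{y}}], \quad \Delta_h u = u_{\bar{x}x} + u_{\bar{y}y}.$$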
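The difference quotients in this notation translate directly into code. The sketch below stores grid values in a nested list `u[i][j]` and uses hypothetical function names; it is meant only to fix the index conventions (first index along $x$, second along $y$).

```python
def backward_x(u, i, j, h1):
    """Backward difference in x at node (i, j)."""
    return (u[i][j] - u[i - 1][j]) / h1

def forward_x(u, i, j, h1):
    """Forward difference in x at node (i, j)."""
    return (u[i + 1][j] - u[i][j]) / h1

def central_x(u, i, j, h1):
    """Central difference in x at node (i, j)."""
    return (u[i + 1][j] - u[i - 1][j]) / (2.0 * h1)

def central_y(u, i, j, h2):
    """Central difference in y at node (i, j)."""
    return (u[i][j + 1] - u[i][j - 1]) / (2.0 * h2)

def second_x(u, i, j, h1):
    """Second difference in x at node (i, j)."""
    return (u[i + 1][j] - 2.0 * u[i][j] + u[i - 1][j]) / h1 ** 2

def second_y(u, i, j, h2):
    """Second difference in y at node (i, j)."""
    return (u[i][j + 1] - 2.0 * u[i][j] + u[i][j - 1]) / h2 ** 2

def laplace_h(u, i, j, h1, h2):
    """Five-point discrete Laplacian at an inner node (i, j)."""
    return second_x(u, i, j, h1) + second_y(u, i, j, h2)

# sample field u(x, y) = x^2 + 3 y on a 11 x 11 grid over the unit square
N1 = N2 = 10
h1, h2 = 1.0 / N1, 1.0 / N2
u = [[(i * h1) ** 2 + 3.0 * (j * h2) for j in range(N2 + 1)] for i in range(N1 + 1)]
print(central_x(u, 5, 5, h1), central_y(u, 5, 5, h2), laplace_h(u, 5, 5, h1, h2))
```

Only inner nodes may be passed to these helpers, since each of them reads at least one neighboring value.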

Direct discretization of the levelset equation. The curvature expressed 
in terms of second-order derivatives 



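$$\nabla\cdot\Big(\frac{\nabla P}{|\nabla P|}\Big) = \frac{\partial^2_{xx}P\,(\partial_y P)^2 - 2\,\partial^2_{xy}P\,\partial_x P\,\partial_y P + \partial^2_{yy}P\,(\partial_x P)^2}{\big((\partial_x P)^2 + (\partial_y P)^2\big)^{3/2}}$$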
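allows us to use central differences to approximate both first- and second-order
derivatives. We then propose an explicit scheme in the following form ($n$ is the
time-level index, $\tau$ is the time step, and the subscripts $\mathring{x}$, $\mathring{y}$ denote central differences):

$$P^{n+1}_{ij} = P^n_{ij} + \tau g(\theta)\,\frac{P_{\bar{x}x}(P_{\mathring{y}})^2 - 2P_{\mathring{x}\mathring{y}}P_{\mathring{x}}P_{\mathring{y}} + P_{\bar{y}y}(P_{\mathring{x}})^2}{(P_{\mathring{x}})^2 + (P_{\mathring{y}})^2} + \tau\sqrt{(P_{\mathring{x}})^2 + (P_{\mathring{y}})^2}\,F,$$

which is subject to a regularization when $(P_{\mathring{x}})^2 + (P_{\mathring{y}})^2 = 0$. The relationship
of $\tau$ and $h$ is given by a stability condition.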
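One update of such an explicit scheme at a single inner node can be sketched as follows. The anisotropy factor is passed as a plain number `g`, and the small constant `delta` guarding the division is an assumption made for this illustration, not the regularization used by the authors; the function name is hypothetical.

```python
import math

def levelset_step(P, i, j, h1, h2, tau, g=1.0, F=0.0, delta=1e-12):
    """One explicit update of P at the inner node (i, j) using central differences."""
    px = (P[i + 1][j] - P[i - 1][j]) / (2.0 * h1)
    py = (P[i][j + 1] - P[i][j - 1]) / (2.0 * h2)
    pxx = (P[i + 1][j] - 2.0 * P[i][j] + P[i - 1][j]) / h1 ** 2
    pyy = (P[i][j + 1] - 2.0 * P[i][j] + P[i][j - 1]) / h2 ** 2
    pxy = (P[i + 1][j + 1] - P[i + 1][j - 1]
           - P[i - 1][j + 1] + P[i - 1][j - 1]) / (4.0 * h1 * h2)
    grad2 = px * px + py * py
    num = pxx * py * py - 2.0 * pxy * px * py + pyy * px * px
    # delta only guards against division by zero where the gradient vanishes
    return P[i][j] + tau * g * num / max(grad2, delta) + tau * math.sqrt(grad2) * F

# level sets of P = x^2 + y^2 are circles; grid on [-1, 1]^2
h = 0.1
P = [[(-1.0 + i * h) ** 2 + (-1.0 + j * h) ** 2 for j in range(21)] for i in range(21)]
tau = 1e-3
print(levelset_step(P, 15, 12, h, h, tau) - P[15][12])
```

For this quadratic test function the curvature term equals 2 at every node away from the origin, so the printed increment should be close to $2\tau$ when $F = 0$.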

The equation (1) defines the motion law on $\Gamma(t)$ only. On the other hand,
the function $P$ is obtained from the equation (2) valid in $\Omega$. In our work, we
extend the forcing term $F$ from (1) to (2) as it is. Other extensions (construction
of extension velocities) are discussed, e.g. in [6].

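Discretization of the regularized levelset equation. Let $\varepsilon > 0$ be a small
regularization parameter. Instead of (2), we solve the following problem:

$$\frac{\partial P}{\partial t} = \sqrt{\varepsilon^2 + |\nabla P|^2}\ \operatorname{div}\Big(\frac{\nabla P}{\sqrt{\varepsilon^2 + |\nabla P|^2}}\Big) + F\sqrt{\varepsilon^2 + |\nabla P|^2} \quad \text{in } (0,T)\times\Omega,$$

$$\frac{\partial P}{\partial n} = 0 \ \text{ on } (0,T)\times\partial\Omega, \qquad P|_{t=0} = P_{ini}(x) \ \text{ in } \Omega.$$
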

It can be approximated using the following explicit nine-point-stencil finite- 
difference scheme: 



Pt+^=PP+rQiV,PP)iPP 



+ 



pk 



pk 



pk 



pk 



h^QiVP^J 



^2Q(VPVi) 

2 > 

+ Pt )], 

i+i,j y,iJ 



where Q(u,v) = ^e^+u^ + y2, VPf^._ . = i(P^ 

= lPLphiPlij + P^^i_,j)h and evaluated ana- 

logically. The scheme is only conditionally stable. 

Discretization for the Allen-Cahn equation is derived by spatial finite
differences. Nodal values then remain functions of time, for which we obtain
a system of ODEs (the semi-discrete scheme) in the following form:



+ fo{p^)) + ^‘^\^hp'^\F on a>h, 
P'*l7h=0> P^{0)=Pini- 



The equations are solved numerically by the fourth-order Runge-Kutta-Merson
method with an adaptive time step. The convergence of the scheme has been
analyzed in [1].
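
For orientation, one step of the classical Merson variant of the Runge-Kutta method, together with a simple step-size control, might look as follows. This is a generic textbook sketch for a scalar ODE, not the authors' solver: the names `merson_step` and `integrate`, the tolerance `tol` and the step-size update rule are assumptions; for the semi-discrete system the same formulas would be applied componentwise.

```python
import math

def merson_step(f, t, y, h):
    """One Runge-Kutta-Merson step for y' = f(t, y).
    Returns the new value and an estimate of the local error."""
    k1 = f(t, y)
    k2 = f(t + h / 3.0, y + h / 3.0 * k1)
    k3 = f(t + h / 3.0, y + h / 6.0 * (k1 + k2))
    k4 = f(t + h / 2.0, y + h / 8.0 * (k1 + 3.0 * k3))
    k5 = f(t + h, y + h / 2.0 * (k1 - 3.0 * k3 + 4.0 * k4))
    y_new = y + h / 6.0 * (k1 + 4.0 * k4 + k5)
    err = h / 30.0 * abs(2.0 * k1 - 9.0 * k3 + 8.0 * k4 - k5)
    return y_new, err

def integrate(f, t0, y0, t_end, h=1e-2, tol=1e-8):
    """Integrate from t0 to t_end, accepting a step only if err <= tol."""
    t, y = t0, y0
    while t_end - t > 1e-14:
        h = min(h, t_end - t)
        y_new, err = merson_step(f, t, y, h)
        if err <= tol:
            t, y = t + h, y_new
        # Assumed controller: rescale h, limited to the range [0.2, 2].
        h *= min(2.0, max(0.2, 0.8 * (tol / max(err, 1e-300)) ** 0.2))
    return y

y_one_step, _ = merson_step(lambda t, y: -y, 0.0, 1.0, 0.1)
y_end = integrate(lambda t, y: -y, 0.0, 1.0, 1.0)
```

Rejected steps are simply retried with the reduced step size, which is the usual way the error estimate is used to adapt the time step.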
2 Parallelization techniques

The algorithms described above are parallelized by means of the Message
Passing Interface (MPI) library, using both the Fortran 77/90 and C programming
languages. Computations using MPI, version 1.1, were performed on the
supercomputing systems IBM SP3 and Cray T3E at CINECA¹ and on IBM SP and
IBM SP2 at the Czech Technical University in Prague; computations using the
LAM MPI library² were performed on a local network of Linux PC workstations
at the Czech Technical University in Prague. In both approaches described below,
the computational task is performed by one or more processes, each of them
running on a separate processor (a hardware unit, or a virtual unit in the
emulated mode).

Cartesian domain splitting is an approach where a rectangular domain
$\Omega$ is decomposed into rectangular subdomains, each of them treated by one
process. Boundaries of subdomains overlap by one grid line, on which they
exchange data. The amount of communication between processes depends on
the blocking strategy. We tested the row-wise blocking strategy, where the
domain is decomposed row-wise. Each block interacts with its neighbouring blocks
during a time step. The other tested strategy was the chequerboard blocking.
In this case, each block communicates with at most eight neighbours during
a time step.





Fig. 1. Cartesian domain splitting (left), and narrow-band splitting (right) 



The narrow-band technique introduced in [6] exploits the fact that we are
interested only in the evolution of the curve $\Gamma(t)$. It is therefore enough
to follow the evolution of $P = P(t, x)$ in the vicinity of the levelset $\Gamma(t)$.
This approach provides a significant speedup. On the other hand, it is less accurate
and more difficult to implement, because it requires a reconstruction of the
narrow band when the curve approaches its edge (the operation is called
reinitialization). In our implementation, we cover the curve by overlapping
squares of a constant width, which are assigned to processes in an intuitive way
(Fig. 1).

¹ Supercomputing Center of Italian Universities, Bologna

² LAM MPI (Local Area Multicomputer) is an open-source implementation of the
MPI standard, http://www.lam-mpi.org
For example, in the case of 64 covering squares and 16 processes, the first four
squares are computed by the first process, the second four squares by the
second process, etc. Consequently, the narrow band created by such squares is
not of constant width. The processes exchange data for all nodes where the
squares overlap. The approach is easy to implement, including processing of the
grid in parts small enough to fit in the fast cache memory of the processors.
For the purpose of algorithm evaluation, we define the following quantities:

\[ \text{Speedup} = \frac{\text{run time in a single process}}{\text{run time in } n \text{ processes}}, \qquad \text{Eff.} = \frac{\text{Speedup}}{\text{number of processes}}. \]
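
The two quantities can be evaluated directly from measured run times. The short sketch below applies the definitions to the CPU times reported for the 200 x 200 mesh on the IBM SP (908 for one process, and 258, 149, 113 and 94 for 4, 8, 12 and 16 processes); the function names are chosen for this illustration only.

```python
def speedup(t_single, t_parallel):
    """Run time in a single process divided by run time in n processes."""
    return t_single / t_parallel

def efficiency(t_single, t_parallel, n_processes):
    """Speedup divided by the number of processes."""
    return speedup(t_single, t_parallel) / n_processes

# CPU times per process for the 200 x 200 mesh, keyed by number of processes.
cpu_times = {4: 258, 8: 149, 12: 113, 16: 94}
eff_percent = {n: round(100 * efficiency(908, t, n)) for n, t in cpu_times.items()}
```

With these inputs the rounded efficiencies come out as 88, 76, 67 and 60 percent.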

Speedup and efficiency of parallelization for the direct algorithm of
the levelset equation - Study 1 (IBM SP). In this study, we consider
a circle of the initial radius $R_0 = 1.35$ placed in the domain $(0,4) \times (0,4)$,
which shrinks according to the law $v_\Gamma = -\kappa_\Gamma$ (see Figure 2a).
The domain is discretized with the mesh size 0.02 in both directions, and the time
step is $\tau = 4 \cdot 10^{-5}$. The number of time steps is 22500, so the
computation stops at $T = 0.9$, shortly before the circle shrinks to a point
(see [1]). The code is parallelized by means of the domain splitting. The results
achieved on the IBM SP system are shown in Table 1.
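
The parameters of this study can be cross-checked against the exact solution for a circle shrinking by its curvature: with $v_\Gamma = -\kappa_\Gamma$ the radius obeys $dR/dt = -1/R$, hence $R(t) = \sqrt{R_0^2 - 2t}$. The sketch below evaluates this formula; the function names are introduced here for illustration only.

```python
import math

def radius(r0, t):
    """Exact radius of a circle shrinking by dR/dt = -1/R."""
    return math.sqrt(r0**2 - 2.0 * t)

def extinction_time(r0):
    """Time at which the circle shrinks to a point."""
    return r0**2 / 2.0

r_final = radius(1.35, 0.9)        # radius at the final time T = 0.9
t_ext = extinction_time(1.35)      # slightly larger than T
n_steps = round(0.9 / 4e-5)        # number of time steps of size 4e-5
```

The values agree with the study setup: the final radius is 0.15, the circle would vanish at about t = 0.911, and 22500 steps are needed to reach T = 0.9.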





Fig. 2. (a) A circle in $(0,4) \times (0,4)$ shrinking from the initial radius
$R_0 = 1.35$ to the radius $R_T = 0.15$ according to the isotropic law
$v_\Gamma = -\kappa_\Gamma$. (b) An initial circle of the radius $R_0 = 3.0$
deforming according to the 5-fold anisotropy law described by Eq. (3), where
$N_{fold} = 5$ and $\zeta = 0.025$.



Study 2 (IBM SP). The above problem (see Figure 2a) was recomputed using
several choices of the mesh size and the time step. As can be seen from Table 2,
the efficiency of parallelization depends on the size of the data exchanged
between processes (e.g., it is faster to send 200 kB of data once than 100 kB
twice, because each transfer has an initiation overhead).

Study 3 (IBM SP and Linux network). In this case, the initial condition
(a circle with the initial radius $R_0 = 1.35$) evolves according to (1) with $F = 0$,
$N_{fold} = 5$ and $\zeta = 0.025$, as indicated in Figure 2b. With numerical parameters
Table 1. The results of parallelization efficiency on IBM SP.

Number of   Mesh nodes    CPU time      Mesh nodes   Communication   Eff.
processes   per process   per process   total        mesh nodes
1           40000         908           40000        0               -
4           10000         258           40000        400 (1.0%)      88%
8           5000          149           40000        800 (2.0%)      76%
12          3333          113           40000        1000 (2.5%)     67%
16          2500          94            40000        1200 (3.0%)     60%



Table 2. Efficiency depends on the mesh size. Computation performed on the IBM
SP system (CPU time per process and efficiency).

Mesh size     200 x 200      267 x 267      400 x 400      667 x 667
Time step     4.0 · 10^-5    2.3 · 10^-5    1.0 · 10^-5    3.6 · 10^-6
Iterations    22500          40000          90000          250000

Mesh size \ Processes   1       4             8             12            16
200 x 200               908     258 (88%)     149 (76%)     113 (67%)     94 (60%)
267 x 267               2585    697 (93%)     392 (83%)     277 (78%)     231 (70%)
400 x 400               14171   3657 (97%)    1915 (93%)    1343 (88%)    1058 (84%)
667 x 667               98904   25574 (97%)   12889 (96%)   8740 (94%)    6775 (91%)



$h_1 = h_2 = 0.01$, $\tau = 1.1 \cdot 10^{-5}$ and 64286 time levels, it terminates
at $t = 0.72$. The curve is covered by squares 35 points wide. Due to the curve
shrinking, the number of active nodes in the narrow band decreases from
approximately 33000 to 16000, as shown in Table 3. Compared to the domain
splitting, the efficiency is lower. This is caused by the fact that the overlapping
areas between processes are larger, and the number of active nodes even increases
with the number of processes. On the other hand, the computation is faster, as
only a part of the grid is active, and the absolute amount of exchanged data is
smaller.



Table 3. Narrow-band approach on IBM SP and Linux network applied to an
anisotropic circle shrinking.

Number of   Min. no.   Avg. no.   Max. no.   Communication     CPU time      CPU time
processes   of active  of active  of active  mesh nodes        per process   per process
            nodes      nodes      nodes      (% of Avg)        (IBM SP)      (Linux cluster)
1           16226      26390      33268      0                 15219         2608
4           16366      26407      33252      560 (2.1206%)     5009 (76%)    884 (74%)
8           16383      26497      33368      1120 (4.2269%)    3260 (58%)    576 (57%)
12          16435      26517      33409      1680 (6.3356%)    2613 (49%)    456 (48%)
16          16581      26620      33536      2240 (8.4147%)    2365 (40%)    -
Speedup and efficiency of parallelization for the regularized levelset
equation - Study 4 (IBM SP3). We considered the test problem shown in
Figure 3. For a given number of processes, it is possible to split the domain
either into NPROC rows, or into NPROCX columns and NPROCY rows in
the chequerboard blocking (NPROCX x NPROCY = NPROC). Unless
NPROC is a prime, there are several possibilities for selecting NPROCX and
NPROCY. Tables 4 and 5 present the run times, numbers of communication
nodes, and efficiencies for both blocking strategies, attained on an IBM SP3
machine. The tables show that the chequerboard blocking is superior to the
row-wise blocking for higher numbers of processes, which is due to a lower
number of communicating nodes.
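
The difference between the two blocking strategies can be made concrete by counting the nodes on the interfaces between blocks. The sketch below uses a counting convention inferred from the reported numbers rather than stated in the text: each interface grid line contributes its full length, and each crossing point of a vertical and a horizontal interface is counted once more. The function names and the assumption of square process grids (2 x 2, 3 x 3, ...) are illustrative only.

```python
def comm_nodes_rowwise(n, nproc):
    """Interface nodes for an n x n grid split into nproc horizontal strips."""
    return (nproc - 1) * n

def comm_nodes_chequerboard(n, npx, npy):
    """Interface nodes for an n x n grid split into npx x npy blocks
    (assumed convention: interface lines plus their crossing points)."""
    return (npx - 1) * n + (npy - 1) * n + (npx - 1) * (npy - 1)

n = 600  # grid of Study 4 (600 x 600 nodes)
rowwise = [comm_nodes_rowwise(n, p * p) for p in (2, 3, 4, 5, 6)]
chequer = [comm_nodes_chequerboard(n, p, p) for p in (2, 3, 4, 5, 6)]
```

Under this convention the row-wise count grows linearly with the number of processes, while the chequerboard count grows only with its square root, which is consistent with the better efficiency of the chequerboard blocking at 25 and 36 processes.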



Table 4. Results of parallelization for the row-wise blocking.

Number of   Mesh nodes    CPU time          Mesh nodes   Communication      Eff.
processes   per process   per process (s)   total        mesh nodes
1           360000        2656              360000       0                  -
4           90000         778               360000       1800 (0.5000%)     85%
9           40000         307               360000       4800 (1.3333%)     96%
16          22500         180               360000       9000 (2.5000%)     92%
25          14400         145               360000       14400 (4.0000%)    73%
36          10000         125               360000       21000 (5.8333%)    57%



Table 5. Results of parallelization for the chequerboard blocking.

Number of   Mesh nodes    CPU time          Mesh nodes   Communication      Eff.
processes   per process   per process (s)   total        mesh nodes
1           360000        2656              360000       0                  -
4           90000         684               360000       1201 (0.3336%)     97%
9           40000         320               360000       2404 (0.6677%)     92%
16          22500         171               360000       3609 (1.0025%)     94%
25          14400         122               360000       4816 (1.3377%)     87%
36          10000         85                360000       6025 (1.6736%)     87%



Speedup and efficiency of parallelization for the Allen-Cahn equation
- Study 5 (IBM SP3). In this computation, we studied the isotropic curve
evolution starting at a four-fold pattern in the spatial domain $(0,2) \times (0,2)$.
The curve approaches the circle of radius $R = 0.6$ according to the law
$v_\Gamma = -\kappa_\Gamma + F(x)$, where the forcing $F$ is a suitable radially
symmetric and linear function. Other parameters are $\xi = 0.01$,
$h_1 = h_2 = 0.00995$. The curve evolution is shown in Figure 4(a). The domain
was divided into 1, 4, and 16 rectangular subdomains,








Fig. 3. Evolution of the cardioida curve is driven by the regularized levelset equation, 
solved in the unit square with e = 10~®, grid 600 x 600, r = 10~^ and 20000 time 
levels 



and the computation was repeated with corresponding number of processes. 
The mesh size and the total number of mesh points remained the same, the 
number of mesh points per process decreased, the number of communication 
mesh points increased, both with increasing number of processes. Measurement 
results are in Table 6. 





Fig. 4. (a) A 4-fold initial curve in $(0,2) \times (0,2)$ approaches the circle of
radius $R = 0.6$ according to $v_\Gamma = -\kappa_\Gamma + F(x)$ for radially
symmetric linear $F$; $\xi = 0.01$, $h_1 = h_2 = 0.00995$. (b) A 4-fold initial curve
in $(0,2) \times (0,2)$ shrinks inside and expands outside of the circle of radius
$R = 0.6$ according to $v_\Gamma = -g(\theta)\kappa_\Gamma + F(x)$ for radially
symmetric linear $F$, $g(\theta) = 1.0 - 0.8\cos(4\theta - \pi/4)$; $\xi = 0.02$,
$h_1 = h_2 = 0.00995$.



Study 6 (CRAY T3E). The computation, performed on the CRAY T3E, studies
the anisotropic curve evolution starting at a four-fold leaf-like curve placed
in the spatial domain $(0,0.4) \times (0,0.4)$. The curve shrinks inside a circle of
radius $R_0 = 0.1$ and expands outside of it, thanks to a spatially dependent
choice of $F$ in the law $v_\Gamma = -g(\theta)\kappa_\Gamma + F$,
$g(\theta) = 1.0 - 0.8\cos(4\theta - \pi/4)$; $\xi = 0.02$, $h_1 = h_2 = 0.00995$.
The curve evolution is shown in Figure 4(b). The domain was divided into
1, 4, 16, 25 and 64 rectangular subdomains, and the
Table 6. Table of parameters for the use of IBM SP3 - Study 5.

Number of   Mesh elements   CPU time      Mesh elements   Communication     Eff.
processes   per process     per process   total           mesh elements
1           40401           118.11        40401           0                 -
4           10201           29.89         40401           401 (0.9925%)     99%
16          2601            10.01         40401           1197 (2.9628%)    74%



computation was repeated with the corresponding number of processes. The mesh
size and the total number of mesh points remained the same; with an increasing
number of processes, the number of mesh points per process decreased and the
number of communication mesh points increased. Measurement results are in
Table 7.



Table 7. Table of parameters for the use of CRAY T3E - Study 6.

Number of   Mesh elements   CPU time      Mesh elements   Communication    Eff.
processes   per process     per process   total           mesh elements
1           40401           37.55         40401           0                -
4           10201           9.65          40401           401 (0.99%)      97%
16          2601            3.23          40401           1197 (2.96%)     73%
25          1681            2.48          40401           1592 (3.94%)     61%
64          625             2.20          40401           2765 (6.84%)     27%
Summary. Results of the study of an elliptic 2D problem with a nonlinear
Newton boundary condition are presented. The problem is discretized by the
FEM, and the integrals are evaluated by numerical quadrature. In the case of
a nonpolygonal domain, the main attention is paid to the effect of a piecewise
linear approximation of the boundary. An error estimate for the solution of the
discrete FE problem is derived.



1 Introduction 

A number of problems in science and technology can be described by partial
differential equations with a nonlinear Newton boundary condition, see e.g. [1],
[8], [5] and [7]. In this contribution we deal with a finite element approximation
of one such problem, which arises in the modelling of the electrolysis of
aluminium with the aid of the stream function. In this case the nonlinear
term in the boundary condition has a “polynomial” behaviour and describes
turbulent flow in a boundary layer.

Let us introduce some notation which we will use later. Let $G \subset \mathbb{R}^2$
be a bounded domain with a Lipschitz-continuous boundary $\partial G$. By
$\overline{G}$ we denote the closure of $G$ and by $n$ the unit outward normal to
$\partial G$. We use the well-known Lebesgue and Sobolev spaces $L^p(G)$,
$L^p(\partial G)$, $W^{k,p}(G)$, $H^k(G) = W^{k,2}(G)$, $W^{k,p}(\partial G)$ for
$k \in \{0,1,2,\dots\}$ and $p \in [1,\infty]$ (see e.g. [6]). By $\|\cdot\|_{k,p,G}$
and $\|\cdot\|_{k,p,\partial G}$ we denote the standard norms in $W^{k,p}(G)$
and $W^{k,p}(\partial G)$, respectively. Then $\|\cdot\|_{0,p,G}$ and
$\|\cdot\|_{0,p,\partial G}$ mean, of course, the norms in $L^p(G)$ and
$L^p(\partial G)$. The symbols $|\cdot|_{k,p,G}$ and $|\cdot|_{k,p,\partial G}$ stand
for the seminorms in $W^{k,p}(G)$ and $W^{k,p}(\partial G)$. The space
$(H^1(G))^*$ is the dual space to $H^1(G)$ and $\langle\cdot,\cdot\rangle$ is the
duality pairing between $(H^1(G))^*$ and $H^1(G)$. If $K \subset \mathbb{R}^2$,
we denote by $P_k(K)$ the space of all polynomials on $K$ of degree less
than or equal to $k$.
2 Continuous Problem

Let $\Omega \subset \mathbb{R}^2$ be a bounded domain. We suppose the boundary
$\partial\Omega$ of $\Omega$ to be
Lipschitz-continuous, and moreover, let the boundary of $\Omega$ be piecewise of the
class C^. 

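We consider the following boundary value problem:

Find $u : \overline{\Omega} \to \mathbb{R}$ such that

\[ -\Delta u = f \quad \text{in } \Omega, \qquad (1) \]
\[ \frac{\partial u}{\partial n} + \kappa |u|^{\alpha} u = \varphi \quad \text{on } \partial\Omega, \qquad (2) \]

where $f : \Omega \to \mathbb{R}$ and $\varphi : \partial\Omega \to \mathbb{R}$ are
given continuous functions, and $\kappa > 0$, $\alpha > 0$ are given constants.

In a standard way we can introduce a weak formulation of the above problem:

A function $u : \Omega \to \mathbb{R}$ is said to be a weak solution of problem
(1) - (2), if

a) $u \in H^1(\Omega)$, (3)

b) $a(u, v) = L(v) \quad \forall v \in H^1(\Omega)$.

The forms $a$ and $L$ from (3,b) are defined for $u, v \in H^1(\Omega)$ by

\[ a(u, v) = b(u, v) + d(u, v), \]
\[ L(v) = L^{\Omega}(v) + L^{\partial\Omega}(v), \]

where

\[ b(u, v) = \int_{\Omega} \nabla u \cdot \nabla v \, dx, \qquad d(u, v) = \kappa \int_{\partial\Omega} |u|^{\alpha} u v \, dS, \]
\[ L^{\Omega}(v) = \int_{\Omega} f v \, dx, \qquad L^{\partial\Omega}(v) = \int_{\partial\Omega} \varphi v \, dS. \]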

It was shown in [2] that the operator $A : H^1(\Omega) \to (H^1(\Omega))^*$
corresponding to the nonlinear form $a$ by

\[ \langle A(u), v \rangle = a(u, v) \quad \forall u, v \in H^1(\Omega) \]

is uniformly monotone, Lipschitz-continuous on every bounded subset of
$H^1(\Omega)$ and coercive; the linear form $L$ defining the right-hand side of
(3,b) is continuous. Hence, by the monotone operator theory, problem (3) has
exactly one solution.
3 Finite Element Discretization

We discretize the problem with the use of the finite element method. Let us
consider a system $\{\Omega_h\}_{h \in (0, h_0)}$, $0 < h_0 < 1$, of polygonal
approximations of $\Omega$. As $\Omega$ need not be convex and thus the inclusion
$\Omega_h \subset \Omega$ need not be valid, we suppose that $\Omega^*$ is
a bounded domain with Lipschitz-continuous boundary such that
$\Omega \subset \Omega^*$ and $\Omega_h \subset \Omega^*$ for every
$h \in (0, h_0)$. Let the right-hand side $f$ of the equation (1) be defined on the
whole domain $\Omega^*$.

On the domains $\Omega_h$ we consider triangulations $\mathcal{T}_h$ formed by
a finite number of closed triangles $T$. We say that $T \in \mathcal{T}_h$ is
a boundary triangle, if $T$ has a side $S \subset \partial\Omega_h$. By $S_h$ we
denote the set of all sides $S \subset \partial\Omega_h$ of all boundary triangles
$T \in \mathcal{T}_h$. We suppose that all vertices of $\mathcal{T}_h$ are in
$\overline{\Omega}$, that all vertices lying on $\partial\Omega_h$ belong to
$\partial\Omega$ too, and that every boundary triangle has exactly two
vertices lying on $\partial\Omega_h$. Moreover, let all points of $\partial\Omega$
at which the smoothness condition on $\partial\Omega$ is not satisfied be vertices
of $\mathcal{T}_h$. Finally, we suppose the intersection of $\partial\Omega$ and
$\partial\Omega_h$ to be formed only by sides and vertices of triangles from
$\mathcal{T}_h$.

We denote by $h_T$ the length of the maximum side of $T \in \mathcal{T}_h$. Let
us set

\[ \bar h = \max_{T \in \mathcal{T}_h} h_T. \]

We assume the index $h$ to be chosen in such a way that $h = \bar h$.

In what follows we denote by $|T|$ the area of a triangle $T \in \mathcal{T}_h$
and by $|S|$ the length of a side $S \in S_h$.

In order to be able to prove the solvability of the discrete problem and the
convergence of the method, we assume that:

a) The system $\{\mathcal{T}_h\}_{h \in (0, h_0)}$ of triangulations is regular,
which means that the magnitudes of inner angles of all triangles
$T \in \mathcal{T}_h$ are bounded from zero by a positive constant
$\vartheta_0$ independent of $h \in (0, h_0)$.

b) The triangulations $\mathcal{T}_h$, $h \in (0, h_0)$, locally satisfy the inverse
assumption at $\partial\Omega$:

There exists $\nu > 0$ such that for every $h \in (0, h_0)$, $S \in S_h$, we have

\[ |S| \geq \nu h. \]

Due to these two assumptions there exists a constant $\lambda > 0$ such that

\[ h^2 \geq |T| \geq \lambda h^2 \]

for every boundary triangle $T \in \mathcal{T}_h$ and every $h \in (0, h_0)$.

An approximate solution of problem (3) will be sought in the space of linear
triangular conforming elements $H_h \subset H^1(\Omega_h)$:

\[ H_h = \{ v_h \in C(\overline{\Omega}_h) : v_h|_T \in P_1(T) \ \ \forall T \in \mathcal{T}_h \}. \]
The forms $a$ and $L$ defining the weak solution are discretized in two steps.
First we integrate in all integrals over $\Omega_h$ and $\partial\Omega_h$ instead
of over $\Omega$ and $\partial\Omega$, respectively. In the second step we apply
quadrature formulae to evaluate these integrals. We suppose the formula used for
the integration over triangles to be exact for all constant functions and the
formula used for the integration over sides to be exact for all linear functions
and to be monotone, i.e., all its coefficients are positive. In such a way we come
to the following approximations of the forms defining the weak solution
($v_h, w_h \in H_h$):

\[ a_h(v_h, w_h) = b_h(v_h, w_h) + d_h(v_h, w_h), \]
\[ L_h(w_h) = L_h^{\Omega}(w_h) + L_h^{\partial\Omega}(w_h), \]

where

\[ b_h(v_h, w_h) = \int_{\Omega_h} \nabla v_h \cdot \nabla w_h \, dx, \]
\[ d_h(v_h, w_h) = \kappa \sum_{S \in S_h} |S| \sum_{\mu=1}^{m} \beta_\mu \left( |v_h|^{\alpha} v_h w_h \right)(x_{S,\mu}), \]
\[ L_h^{\partial\Omega}(w_h) = \sum_{S \in S_h} |S| \sum_{\mu=1}^{m} \beta_\mu (\varphi_h w_h)(x_{S,\mu}), \]
\[ L_h^{\Omega}(w_h) = \sum_{T \in \mathcal{T}_h} |T| \sum_{\mu=1}^{M} \omega_\mu (f w_h)(x_{T,\mu}). \]

Here $\varphi_h$ is an approximation of the function $\varphi$ from the boundary
condition, which will be introduced later.

Let us note that in the approximation of the form b we do not use numerical 
integration, as we integrate constant functions. 
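
To illustrate how the boundary form $d_h$ is assembled, the sketch below evaluates the contribution of a single side $S$ for piecewise linear $v_h$ and $w_h$. It is only an example under stated assumptions: the trapezoidal rule (nodes at the end points, weights 1/2 and 1/2) is chosen because it is exact for linear functions and has positive coefficients, as the text requires, but the text does not say which formula was actually used. The function name and argument layout are hypothetical.

```python
import math

def side_contribution(a, b, v_ab, w_ab, kappa, alpha,
                      nodes=(0.0, 1.0), weights=(0.5, 0.5)):
    """kappa * |S| * sum_mu beta_mu * (|v|^alpha * v * w)(x_{S,mu}) on the side
    S = [a, b]. v_ab and w_ab are the values of the linear functions v_h and w_h
    at the end points a and b; nodes are given in the parameter s in [0, 1]."""
    length = math.dist(a, b)
    total = 0.0
    for s, beta in zip(nodes, weights):
        v = (1.0 - s) * v_ab[0] + s * v_ab[1]   # v_h at the quadrature node
        w = (1.0 - s) * w_ab[0] + s * w_ab[1]   # w_h at the quadrature node
        total += beta * abs(v) ** alpha * v * w
    return kappa * length * total

# Side from (0, 0) to (3, 4), so |S| = 5; v_h = 1, 2 and w_h = 1, 1 at the ends.
value = side_contribution((0.0, 0.0), (3.0, 4.0), (1.0, 2.0), (1.0, 1.0),
                          kappa=2.0, alpha=1.0)
```

Summing such contributions over all sides in $S_h$ gives $d_h(v_h, w_h)$; a higher-order rule can be substituted by passing other `nodes` and `weights`.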

Now we can define an approximate solution of problem (3) as a function
$u_h : \Omega_h \to \mathbb{R}$ such that

a) $u_h \in H_h$, (4)

b) $a_h(u_h, v_h) = L_h(v_h) \quad \forall v_h \in H_h$.

4 Ideal Triangulation 

Since the problem is nonlinear in the boundary condition, we meet some
difficulties in the analysis of the FEM. These are caused especially by the fact
that in general the boundaries of $\Omega_h$ and $\Omega$ may not be identical,
the union of all triangles of $\mathcal{T}_h$ may not form $\overline{\Omega}$,
and that we seek the approximate solution in a space which may be different from
the space in which we look for the weak solution. We handle these difficulties
with the aid of Zlámal's ideal elements (see [12]).
First we introduce the concept of the ideal triangle. Let us have a
triangulation $\mathcal{T}_h$ of the domain $\Omega_h$ and let
$T \in \mathcal{T}_h$. If $T$ is not a boundary triangle, we set $T^{id} = T$ and
we denote its vertices by $P_1$, $P_2$, $P_3$ in any way. If $T$ is a boundary
triangle, then we number its vertices in such a way that the vertices $P_1$ and
$P_3$ lie on the boundary. Then we replace in the triangle $T$ the straight side
$S = P_1P_3 \subset \partial\Omega_h$ by the arc
$\widehat{P_1P_3} \subset \partial\Omega$ to get the curved ideal triangle
$T^{id}$. We denote the set of all ideal triangles by $\mathcal{T}_h^{id}$ and
call it the ideal triangulation associated with $\mathcal{T}_h$. It is obvious that
the union of all ideal triangles from $\mathcal{T}_h^{id}$ forms
$\overline{\Omega}$.

As the functions from the space $H_h$ are not defined on the whole domain
$\Omega$, we need to modify these functions. We proceed in the following
way:

Let us consider the reference triangle $K$ in the $(\xi_1, \xi_2)$-plane with the
vertices $R_1 = (0,0)$, $R_2 = (1,0)$ and $R_3 = (0,1)$. We denote by $X^0$ the
affine mapping which maps the triangle $K$ one-to-one on the triangle $T$ in such
a way that $X^0(R_i) = P_i$ for $i = 1, 2, 3$, and let $\Xi^0$ be its inverse.

By Zlámal [12], there exists a mapping $X$ which maps the reference triangle $K$
one-to-one on the ideal triangle $T^{id}$ such that $X$ as well as its inverse
$\Xi$ are
of the class C^. With the aid of these mappings we can now define a function 
$\hat w$ associated with a function $w \in H_h$ by

\[ \hat w(x) = w(X^0(\Xi(x))) \quad \forall x \in T^{id}. \]

It is obvious that we can construct this function $\hat w$ for any function $w$
defined on $\Omega_h$ and not only for $w \in H_h$.

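In a similar way we can introduce a function
$\hat\gamma : \partial\Omega \to \mathbb{R}$ associated with a function $\gamma$
defined on $\partial\Omega_h$. We put

\[ \hat\gamma(x) = \gamma(X^0(\Xi(x))) \]

for all $S \in S_h$ and all $x$ on the arc of $\partial\Omega$ that replaces $S$.

Moreover, we can define for every function
$\gamma : \partial\Omega \to \mathbb{R}$ its approximation $\gamma_h$ given on
$\partial\Omega_h$ by

\[ \gamma_h(x^0) = \gamma(X(\Xi^0(x^0))) \quad \forall S \in S_h, \ \forall x^0 \in S. \]

In such a way we approximate the function $\varphi$ from the boundary condition.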

Remark: It is obvious that $\hat w|_T = w|_T$ for every $T \in \mathcal{T}_h \cap \mathcal{T}_h^{id}$.

It was shown in [4] that 



f v{x)dS — f v(x)dS < c^h f \v{x)\dS 
J Qf2 J df2h J df2h 

for every v G L^{df2h)- Moreover, the following relation between the seminorms 
of w and w in the Sobolev spaces and is valid: 
where



1 <!|| <! + »/.. 



These estimates allow us to derive the relations (5, a, b) between norms and
seminorms of $w$ and $\hat w$ stated in the following Lemma 1. The proof of
(5, d) can be found in the same paper [4]; for the proof of (5, c) see [10].



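Lemma 1. Let $p \in [1, \infty)$. Then there exist positive constants $C_1 = C_1(p)$,
$C_2 = C_2(p)$, $C_3 = C_3(p)$, $C_4$ and $h_1 \in (0, h_0]$ such that for every
$h \in (0, h_1)$

a) $C_1^{-1} \|\hat w\|_{0,p,\partial\Omega} \leq \|w\|_{0,p,\partial\Omega_h} \leq C_1 \|\hat w\|_{0,p,\partial\Omega} \quad \forall w \in L^p(\partial\Omega_h)$,

b) $C_2^{-1} |\hat w|_{1,p,\partial\Omega} \leq |w|_{1,p,\partial\Omega_h} \leq C_2 |\hat w|_{1,p,\partial\Omega} \quad \forall w \in W^{1,p}(\partial\Omega_h)$, (5)

c) $C_3^{-1} |\hat w|_{1,p,\Omega} \leq |w|_{1,p,\Omega_h} \leq C_3 |\hat w|_{1,p,\Omega} \quad \forall w \in H_h$,

d) $C_4^{-1} \|\hat w\|_{1,2,\Omega} \leq \|w\|_{1,2,\Omega_h} \leq C_4 \|\hat w\|_{1,2,\Omega} \quad \forall w \in H_h$.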



Remark: In what follows, whenever we refer to any result from [2] or
[3], where the domain $\Omega$ is polygonal, the polygonality of $\Omega$ was not
used in the associated proof.



5 Existence of Approximate Solutions. 

Let / G for some q > 2 and (p G for some r > 1. By [2] 

the forms ah are under these assumptions continuous and strictly monotone. 
Moreover, it was shown in [4] that the forms Lh are continuous and that there 
exists a positive constant C 5 such that 

\L{vh) - Lh{vh)\ < C^h\\vh\\ l,2,i7h 

for every h G (0, ho), G Hh- Concerning the coercivity of the forms Oh^ we 
have the following lemma: 

Lemma 2. Let h\ and C4 be as in Lemma 1. Then there exist h2 G ( 0 , hi] and 
Ce > 0 such that 

for every h G ( 0 , h2) and vh G Hh with \\vh\\i^2,Qh ^ ^4- 

Remark: In what follows, we suppose ho to be so small that h2 = hi = ho 
in Lemmas 1 and 2. 

The coercivity of ah was obtained by the aid of the following estimate of 
the error in the approximation of the form d : 

\d{i)h,Wh) -dh{vh,Wh)\ < 

V/i e (0, ^o), '^Vh,wh e Hh, 

where C7 = C^ipr) is a positive constant, r G [ 1 , 00). 

On the basis of the above mentioned properties of the forms Lh and a/^, 
similar to those of the forms L and a, the monotone operator theory gives: 
Theorem 1. Discrete problem (4) has for every $h \in (0, h_0)$ exactly one
solution $u_h \in H_h$.

(See [4], Theorem 4.1.) 



6 Convergence of the Finite Element Method 

The monotonicity of the forms $a_h$ is sufficient for the proof of the existence
and uniqueness of the approximate solutions. However, for the derivation of the
error estimate of the method we need a somewhat stronger result. By [11], the
forms $a_h$ are “almost uniformly monotone” in the following sense:

ah{vh,Vh -Wh) -ah{wh,Vh - Wh) > (t{\\ Vh ~ Wh \\i,2,o) 

yvh,Wh e Hh, 

where the function 



/.X f c for 0 < t < (c*)-« 

Iort>(c-)-i 

is nonnegative and increasing in [0, oo). 

In [11] also another estimate of the error in the approximation of the form 
d was derived: 

We have (1 < p < oo, Cg^Cg = Cg{p) are positive constants): 

\d{vh,Wh) - dh{vh,Wh)\ < 

— (^^sh\\vh\\i^^f2h ^ \'^h\l,p,nh\\'^h\\o^c>o,f2h^ || 1,2,/?^ 

for all h € (0, hg) and Vh^Wh € (We set ^ = 0.) 

On the basis of the above inequalities an abstract error estimate was
established. In what follows, we denote by $u$ the weak solution and by
$u_h$, $h \in (0, h_0)$, the solutions of the discrete problem.

Lemma 3. Let Q{t) = for t > 0, Q(0) = 0, Q_i he the inverse of 
Q and let 1 < p < oo. Then there exist positive constants Cio, Cn, C 12 and 
Ci 3 = Cis{p) such that 

\\u - Uh\\i,2,a < \\u - Vh\\i,2,n+ 

+Q-1 {Cioh + Cii(l 4- ||'^^||i,2,r2 + ||'^/i||i,2,i?)ll'^ “ '^h\\i,2,n+ 

+Ci2h{\\Vh\\l,2,f2h + 11'^^ 111,2,^2,,) + p 00,^2^} 



for all h G (0, hg) and Vh G Hh- 
Finally, we can formulate the main result, the error estimate for the solution
of the discrete problem (see [11], Theorem 4.1):

Theorem 2. Let the solution u of the continuous problem satisfy u G 
and let Uc G be an extension of the function u on R2. Then for every 

p G (2, oo) there exist h G (0, ho] and a positive constant C = C(jp^ ||t^c||2,2,r?*) 
such that 

11'^^ — < Ch «+i 

for all h G (0, h). 

(For the existence of Uc see, e.g., [9], Theorem 3.10.) 

Remark: If there exists such a function Uc G fl VF^’°®(f?*) that 

Uc\f2 — it is possible to show that the rate of convergence of the method is 
0(h^+i ). It is an open question, whether this stronger assumption is necessary 
for improving the error estimate from Theorem 2. 
Summary. We propose and test a fully automatic, goal-oriented hp-adaptive
strategy for elliptic problems. The method combines two approaches: the standard
goal-oriented adaptivity based on a simultaneous solution of the primal and dual
problem, and a recently proposed automatic hp-adaptive strategy based on
minimizing the projection-based interpolation error of a reference solution.



1 Introduction 

Nowadays, the theory of the hp-version of the finite element method is
well-established and founded on solid results, mostly due to the efforts of
Babuška and coworkers. However, the practical realization of fully automatic and
robust 3D hp-adaptive algorithms still presents many serious difficulties, mainly
due to excessive programming complexity.

We would like to introduce a novel fully automatic algorithmic approach to
goal-oriented hp-adaptivity for elliptic problems. The methodology does not
rely on estimates of the error or its higher derivatives, and it is capable of
achieving exponential convergence not only in the asymptotic but also in the
preasymptotic range of the error level. Due to the limited length of this paper,
only basic ideas of the approach can be presented, but many details on both
theory and computer implementation can be found, e.g., in [3, 4, 5, 7, 8].



2 Different roles of error estimation in h-, p- and
hp-adaptivity

Error estimation forms an essential part of most h- and p-adaptive finite
element algorithms. Recall that h-adaptivity is based on spatial refinement of
the elements with the largest contributions to the error, and that p-adaptivity
achieves the reduction of the error by increasing the polynomial order in
elements. In both cases, thanks to the low number of ways in which an element
can be refined (in most cases only one), the estimate of the magnitude of the
error in elements is sufficient to guide the adaptive process.
The situation, however, changes with hp-adaptivity, which allows both for
pure p-refinements and spatial refinements with suitable redistribution of the
polynomial order to element sons. Typically one has several options to choose
from, and the number of possibilities increases dramatically as the polynomial
order of elements in the mesh gets higher. The situation is illustrated in Fig. 1.






Fig. 1. Several options for refinement of a quadratic triangular element. The 
numbers in the elements indicate their polynomial orders. 



It is clear that information about the magnitude of the error in elements is
not enough to drive an automatic hp-adaptive algorithm: in order to decide
between a pure p-refinement and a (let us call it) genuine hp-refinement, and
for the selection of optimal polynomial orders in the element sons, one needs
information about the actual shape of the error.

There are several possible approaches to do this, all of them in some sense
utilizing information about higher-order spatial derivatives of the error
function. In [2] the authors apply standard error estimates to the element sons;
[6] uses duality-based estimates of higher derivatives of the error in a
goal-oriented algorithm. We will follow the idea of [4] and calculate an
approximation to the error function by means of reference solutions.



3 Reference solutions and approximate error function 

Consider a bounded domain $\Omega$ with a Lipschitz continuous boundary
and a standard boundary value problem

\[ b(u, v) = f(v) \quad \text{for all } v \in V, \qquad (1) \]

where $V = V(\Omega)$ is a Hilbert space, $b$ a symmetric bilinear positive-definite
elliptic form over $V \times V$ and $f \in V'$ a linear form.

By $\mathcal{T}_{h,p}$ and $u_{h,p}$ we denote the coarse mesh and the coarse mesh
approximation to the exact solution $u$, respectively. For simplicity let us say
that the mesh $\mathcal{T}_{h,p}$ covers the domain $\Omega$ exactly.

Assume that one can use the values of the coarse mesh approximation $u_{h,p}$
to calculate a function $u_{ref}$ that approximates the exact solution $u$
essentially better than $u_{h,p}$ itself. Then the difference
\[ u_{ref} - u_{h,p} \qquad (2) \]

gives a meaningful approximation of the true error

\[ e_{h,p} = u - u_{h,p}, \qquad (3) \]

and the function $u_{ref}$ is called reference solution.

There are various ways to calculate reference solutions. For example, highly
accurate approximations based on Babuška's extraction formulae (postprocessing
formulae applicable mainly to lower-order elements) were used to guide
hp-adaptivity in [5]. A robust way to obtain reference solutions for elliptic
problems without any limitation on the polynomial order was proposed by
Demkowicz [4]. The function $u_{ref}$ is defined as the approximate solution on
a uniformly hp-refined mesh, i.e. on a mesh where all elements are refined so that
$h \to h/2$ and $p \to p + 1$. Since $u_{h,p}$ already contains useful information
about lower frequencies in the solution, the higher frequencies identifying
$u_{h/2,p+1}$ can be obtained with a reasonable amount of work using a two-grid
solver.



4 Projection-based interpolation 

This elementwise local technique, which plays an essential role in the presented
automatic adaptive algorithm, generalizes the standard Lagrange (vertex)
interpolation to higher-order finite elements by combining it with projection onto
spaces generated by hierarchic higher-order shape functions. We can confine
ourselves to a reference domain, since this is where almost all operations in an
hp-adaptive code are performed. Choose, for example, a reference triangle $\hat{T}$.
Let $p^b$ denote the polynomial order in the interior of $\hat{T}$ and $p^{e_1}$, $p^{e_2}$ and $p^{e_3}$ the
polynomial orders related to its edges $e_1$, $e_2$ and $e_3$, respectively. By $v_1, v_2, v_3$
denote the vertices of $\hat{T}$. Consider a sufficiently regular function $w$ defined in
$\hat{T}$.

The projection-based interpolant $w_{h,p}$ is constructed in three steps. First,
one calculates the vertex interpolant $w^v_{h,p}$ as a linear combination of the three
vertex shape functions such that

$$w^v_{h,p}(v_j) = w(v_j), \quad j = 1, \ldots, 3. \tag{4}$$

In the next step one subtracts the vertex interpolant from the original function
$w$ and defines a new function

$$w^{(1)} := w - w^v_{h,p} \tag{5}$$

that vanishes at all vertices. The edge interpolant $w^e_{h,p}$ is computed as
a sum of contributions over all edges,

$$w^e_{h,p} = \sum_{k=1}^{3} w^{e_k}_{h,p}. \tag{6}$$
Each function $w^{e_k}_{h,p}$, $k = 1, \ldots, 3$, is a linear combination of the edge shape
functions $\varphi^{e_k}_2, \ldots, \varphi^{e_k}_{p^{e_k}}$ such that

$$\left\| w^{(1)} - w^{e_k}_{h,p} \right\|_{H^{1/2}_{00}(e_k)} \tag{7}$$

is minimal. The minimum is achieved if and only if the trace of the difference
$w^{(1)} - w^{e_k}_{h,p}$ is orthogonal to the traces of all edge functions
$\varphi^{e_k}_2, \ldots, \varphi^{e_k}_{p^{e_k}}$ on the edge $e_k$ in the inner product of
$H^{1/2}_{00}(e_k)$. Hence the discrete minimization problem
(7) translates for each edge $e_k$, $k = 1, \ldots, 3$, into a system of $p^{e_k} - 1$ linear
algebraic equations. The norm of $H^{1/2}_{00}(e_k)$ is defined using harmonic extensions
of functions defined on edges to the element interior, and is therefore difficult to
evaluate exactly. For practical computations one can replace it by a weighted
$H^1_0$-norm [4, 8].

In the last step one defines a new function

$$w^{(2)} := w^{(1)} - w^e_{h,p}. \tag{8}$$

Notice that this function generally does not vanish on edges. The bubble
interpolant $w^b_{h,p}$ is obtained by projecting $w^{(2)}$ on the space generated by the bubble
functions of orders up to $p^b$ in the $H^1_0$-seminorm. Minimization of the difference

$$w^{(2)} - w^b_{h,p} \tag{9}$$

in the $H^1_0$-seminorm translates, analogously to the previous case, into
$(p^b - 1)(p^b - 2)/2$ linear algebraic equations.

Finally, the projection-based interpolant $w_{h,p}$ is defined as the sum of the
vertex, edge and bubble interpolants,

$$w_{h,p} = w^v_{h,p} + w^e_{h,p} + w^b_{h,p}. \tag{10}$$
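The three-step construction can be sketched in one spatial dimension, where there are no edges and the interpolant reduces to the vertex interpolant plus the bubble projection. The sketch below is only an illustration under stated assumptions: it works on the reference interval (-1, 1), uses integrated Legendre polynomials as bubble functions, and evaluates integrals by Gauss-Legendre quadrature. The function name `projection_based_interpolant_1d` and its arguments are hypothetical and are not taken from [4] or from any existing code.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss


def projection_based_interpolant_1d(w, dw, p, n_quad=32):
    """Return a callable w_hp on [-1, 1]: vertex interpolant plus bubble projection.

    w, dw : callables returning the function and its derivative (array input)
    p     : polynomial order of the element (p >= 1)
    """
    xq, wq = leggauss(n_quad)                  # Gauss-Legendre nodes and weights
    wl, wr = w(-1.0), w(1.0)                   # vertex values, cf. (4)
    slope = 0.5 * (wr - wl)                    # derivative of the vertex interpolant
    resid_prime = dw(xq) - slope               # derivative of w - w_vertex, cf. (5)

    bubbles, coeffs = [], []
    for k in range(2, p + 1):
        leg = Legendre.basis(k - 1)            # derivative of the k-th bubble
        # (l_j', l_k') = 2/(2k-1) * delta_jk, so the H^1_0 projection decouples
        c = np.sum(wq * resid_prime * leg(xq)) * (2 * k - 1) / 2.0
        bubbles.append(leg.integ(lbnd=-1))     # l_k(x) = int_{-1}^{x} P_{k-1}(t) dt
        coeffs.append(c)

    def w_hp(x):
        x = np.asarray(x, dtype=float)
        val = 0.5 * wl * (1.0 - x) + 0.5 * wr * (1.0 + x)   # vertex part
        for c, b in zip(coeffs, bubbles):
            val = val + c * b(x)                            # bubble part
        return val

    return w_hp
```

For a polynomial $w$ of degree at most $p$ the construction returns $w$ itself, which gives a quick sanity check of an implementation.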



5 Adaptivity as elementwise minimization of 
approximate error 

At the beginning of each mesh adaptation step the following information is
available: the coarse mesh $\mathcal{T}_{h,p}$, the coarse mesh solution $u_{h,p}$, the
uniformly refined mesh $\mathcal{T}_{h/2,p+1}$, the reference solution $u_{ref} = u_{h/2,p+1}$, and
the approximate error function $err_{h,p} = u_{ref} - u_{h,p}$. The question is how to
use the function $err_{h,p}$ to adapt the mesh $\mathcal{T}_{h,p}$ in an optimal way.

Recall from Fig. 1 that there are always several ways in which an element
can be hp-refined. With no further information on the exact solution $u$, it
is not known a priori which hp-refinement is optimal. A possible strategy
would therefore be to go through all element refinement options,
each time creating a new mesh $\mathcal{T}^*_{h,p}$, computing a new approximate solution $u^*_{h,p}$,
and selecting the element refinement option that maximizes the error
drop $\Delta err_{h,p}$,

$$\begin{aligned}
\Delta err_{h,p} &= \|err_{h,p}\|_{e,\Omega} - \|err^*_{h,p}\|_{e,\Omega} \\
&= \|u_{ref} - u_{h,p}\|_{e,\Omega} - \|u_{ref} - u^*_{h,p}\|_{e,\Omega} \\
&= \sum_{i=1}^{N} \left( \|u_{ref} - u_{h,p}\|_{e,K_i} - \|u_{ref} - u^*_{h,p}\|_{e,K_i} \right).
\end{aligned} \tag{11}$$

Here $\|\cdot\|_{e,\Omega}$ denotes the standard energy norm,

$$\|w\|^2_{e,\Omega} = b(w, w), \tag{12}$$

which is defined for every form $b$ satisfying the assumptions listed at the beginning
of Section 3.

Unfortunately, $\Delta err_{h,p}$ is a global quantity and (11) cannot be maximized
elementwise. Clearly one cannot afford to solve the global discrete problem
for each element and each of its hp-refinement options.

However, at the cost of introducing an asymptotically negligible error, one
can replace the coarse mesh solution $u_{h,p}$ in (11) with the coarse mesh
interpolant $\Pi_{h,p} u_{ref}$ of the reference solution, and at the same time the function
$u^*_{h,p}$ with the interpolant $\Pi^*_{h,p} u_{ref}$ of the reference solution on the mesh $\mathcal{T}^*_{h,p}$.
This transforms (11) into

$$\Delta ERR_{h,p} = \sum_{i=1}^{N} \left( \|u_{ref} - \Pi_{h,p} u_{ref}\|_{e,K_i} - \|u_{ref} - \Pi^*_{h,p} u_{ref}\|_{e,K_i} \right). \tag{13}$$

This is a step of crucial importance. Thanks to the locality of the projection-based
interpolation operators described in Section 4, the approximate error $ERR_{h,p}$
can be minimized elementwise. In other words, one does not have to solve
global discrete problems for all investigated element refinement options. For
each element $K_i$, $i = 1, 2, \ldots, N$, the global problem of maximizing
$\Delta err_{h,p}(\Omega)$ is replaced with a local problem of maximizing $\Delta ERR_{h,p}(K_i)$,
where

$$\Delta ERR_{h,p}(K_i) = \|u_{ref} - \Pi_{h,p} u_{ref}\|_{e,K_i} - \|u_{ref} - \Pi^*_{h,p} u_{ref}\|_{e,K_i}. \tag{14}$$

In each mesh optimization step the maximum of the interpolation error decrease
rates over all elements is calculated, and only elements with rates exceeding,
e.g., 1/3 of the maximum are selected for refinement. The implementation,
however, involves many important details that exceed the scope of this paper;
we refer, e.g., to [4, 7].
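The selection rule in the last paragraph can be sketched as follows. This is a minimal illustration only: the function name `select_elements` is hypothetical, and the elementwise decrease rates are assumed to have been computed beforehand (for each element, the best value of (14) over its refinement options).

```python
def select_elements(decrease_rates, fraction=1.0 / 3.0):
    """Return indices of elements whose decrease rate exceeds fraction * maximum."""
    if not decrease_rates:
        return []
    threshold = fraction * max(decrease_rates)
    # strict inequality: elements with zero decrease are never selected
    return [i for i, rate in enumerate(decrease_rates) if rate > threshold]
```

The default threshold of 1/3 is the example value quoted in the text; it is a parameter, not a fixed part of the method.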
6 One-dimensional illustration

The algorithm is best illustrated in one spatial dimension. Competitive
refinements have a very natural structure here: adding one DOF to a linear
element ($p = 1$) can be done either as a $p \to p + 1$ refinement or as an $h \to h/2$
refinement with $p_L = p_R = 1$. Adding one DOF to a quadratic element can
be done either as a $p \to p + 1$ refinement or as an $h \to h/2$ refinement with
$p_L = 1$, $p_R = 2$ or $p_L = 2$, $p_R = 1$. Adding one DOF to a cubic element results
in four analogous options, and so on.
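The competitive refinements listed above follow from a simple count: a 1D element of order $p$ carries $p + 1$ DOF, while two sons of orders $p_L$ and $p_R$ sharing one vertex carry $p_L + p_R + 1$ DOF, so exactly one DOF is added when $p_L + p_R = p + 1$. A minimal sketch of this enumeration (the function name and the tuple encoding are hypothetical):

```python
def competitive_refinements(p):
    """List the 1D refinements of an element of order p that add exactly one DOF."""
    options = [("p", p + 1)]                      # p -> p + 1
    # h -> h/2 with son orders (p_L, p_R) such that p_L + p_R = p + 1
    for p_left in range(1, p + 1):
        options.append(("h", p_left, p + 1 - p_left))
    return options
```

For a cubic element this returns four options, in agreement with the text.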

Consider, e.g., the Poisson problem $-u'' = f$ in the interval $\Omega = (0, \pi)$
with homogeneous Dirichlet boundary conditions. The function $f$ is chosen in
such a way that the function $u(x) = k(1 - x)^m \sin(nx)$, $k = 2$, $m = 2$, $n = 5$,
depicted in Fig. 2, is the exact solution.




Fig. 2. The exact solution u 



The following results were obtained by means of the one-dimensional
automatic goal-oriented hp-adaptive C++ code MESHOPT¹. One starts from
an equidistant mesh consisting of three quadratic elements. The convergence
curve in the $H^1$-seminorm is shown in Fig. 3.

Figs. 4-7 visualize the first few steps of the automatic hp-adaptive algorithm.



7 Incorporation of goal-oriented adaptivity 

In contrast to adaptivity in the energy norm, which attempts to minimize
the energy of the residual of the approximate solution, the goal-oriented
approach attempts to control specific features of the solved problem (quantities
of interest). Very often, quantities of interest can be represented as bounded
linear functionals of the solution; see the basic literature on goal-oriented adaptivity
or, for a review, [7, 8].

¹ MESHOPT can be downloaded free of charge from the web page of the second author,
http://www.caam.rice.edu/~solin.

Fig. 3. Convergence curve. x-axis: number of DOF; y-axis: $|u - u_{h,p}|_{H^1(\Omega)}$ in decimal
logarithmic scale.

Fig. 4. Left: $u_{h,p}$ (solid line) and $u_{ref} = u_{h/2,p+1}$ (dashed line). Right: projection
error decrease rates. Only element $K_3$ exceeds 1/3 of the maximum and is selected for
refinement. The largest decrease of interpolation error on $K_3$ is achieved by its
p-refinement.



7.1 Dual problem and estimate of error in goal 

Let us recall the basic ideas leading to the formulation of the dual problem,
since they will be used to incorporate goal-oriented adaptivity into the
energy-driven hp-adaptive strategy discussed in Section 5.

Consider problem (1) and its discrete version $b(u_{h,p}, v_{h,p}) = f(v_{h,p})$ for all
$v_{h,p} \in V_{h,p}$, where $V_{h,p} \subset V$ is a polynomial finite element approximation of the
space $V$. Define the error $e_{h,p} = u - u_{h,p}$ and consider the residual
$r_{h,p}(v) = f(v) - b(u_{h,p}, v)$, $v \in V$. Relate the residual $r_{h,p}$ to the error in the quantity of
interest, i.e. find $G \in V''$ such that $G(r_{h,p}) = L(e_{h,p})$. By reflexivity, $G$ can be
identified with an element $v$ of the original space (the influence function),
Fig. 5. Left: $u_{h,p}$ (solid line) and $u_{h/2,p+1}$ (dashed line). Right: projection error
decrease rates. This time, all elements are selected for refinement. The automatically
selected best combination is p-refinement of $K_1$ and $K_2$ and hp-refinement with
$p_L = p_R = 2$ for $K_3$.

Fig. 6. Left: $u_{h,p}$ (solid line) and $u_{h/2,p+1}$ (dashed line). Right: projection error
decrease rates. Elements $K_1$ and $K_4$ will be p-refined.

Fig. 7. Left: $u_{h,p}$ (solid line) and $u_{h/2,p+1}$ (dashed line). Right: projection error
decrease rates. Element Ka will be p-refined. In the following step (not depicted),
element $K_2$ will be selected for hp-refinement with $p_L = p_R = 2$. And so on.



$$G(r_{h,p}) = r_{h,p}(v) = f(v) - b(u_{h,p}, v) = b(u, v) - b(u_{h,p}, v) = b(e_{h,p}, v) = L(e_{h,p}), \tag{15}$$

where $v$ is the solution to the dual problem: Find $v \in V$ such that

$$b(u, v) = L(u) \tag{16}$$

for all $u \in V$. Consider the discrete dual problem $b(u_{h,p}, v_{h,p}) = L(u_{h,p})$ for all
$u_{h,p} \in V_{h,p}$. Estimate the error in the quantity of interest by means of the errors
in the energy norm for both the primal and the dual problem:

$$\begin{aligned}
|L(u) - L(u_{h,p})| &= |L(u - u_{h,p})| = |b(u - u_{h,p}, v)| \\
&= |b(u - u_{h,p}, v - v_{h,p})| \le \sum_{K \in \mathcal{T}_{h,p}} \|u - u_{h,p}\|_{e,K} \, \|v - v_{h,p}\|_{e,K}.
\end{aligned} \tag{17}$$

Here the standard orthogonality property of the error in the solution,
$b(u - u_{h,p}, v_{h,p}) = 0$, was used.

7.2 Elementwise minimization of approximate error in goal 

Recall that the energy-driven hp-adaptive algorithm from Section 5 minimizes
the global error in the energy norm by elementwise maximizing the drop
of the projection-based interpolation error (13) of the reference solution $u_{ref}$ from
the coarse mesh $\mathcal{T}_{h,p}$ to the next optimal mesh $\mathcal{T}^*_{h,p}$.

The estimate (17) shows that the error $|L(u) - L(u_{h,p})|$ in the goal is
controlled by the errors of both the primal and the dual solution in the energy norm.
Therefore, an hp-adaptive algorithm will minimize the error in the goal if, instead
of elementwise maximizing (13), it elementwise maximizes the product

$$\Delta ERR^{goal}_{h,p}(K_i) = \Delta ERR^{u}_{h,p}(K_i) \, \Delta ERR^{v}_{h,p}(K_i), \tag{18}$$

where

$$\Delta ERR^{u}_{h,p}(K_i) = \|u_{ref} - \Pi_{h,p} u_{ref}\|_{e,K_i} - \|u_{ref} - \Pi^*_{h,p} u_{ref}\|_{e,K_i} \tag{19}$$

and

$$\Delta ERR^{v}_{h,p}(K_i) = \|v_{ref} - \Pi_{h,p} v_{ref}\|_{e,K_i} - \|v_{ref} - \Pi^*_{h,p} v_{ref}\|_{e,K_i}. \tag{20}$$

Here $v_{ref} = v_{h/2,p+1}$ is the reference solution to the dual problem, calculated
on the uniformly hp-refined mesh $\mathcal{T}_{h/2,p+1}$.
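The product criterion (18) can be sketched as below. This is an illustration under assumptions: the elementwise drops (19) and (20) are taken as given inputs, the function name `goal_oriented_selection` is hypothetical, and reusing the 1/3 threshold of Section 5 for the product is an assumption made here, not a statement from the text.

```python
def goal_oriented_selection(primal_drops, dual_drops, fraction=1.0 / 3.0):
    """Combine elementwise primal and dual drops as in (18) and select elements."""
    if len(primal_drops) != len(dual_drops):
        raise ValueError("one primal and one dual value per element is required")
    # elementwise product of the primal drop (19) and the dual drop (20)
    products = [du * dv for du, dv in zip(primal_drops, dual_drops)]
    if not products:
        return products, []
    threshold = fraction * max(products)
    selected = [i for i, value in enumerate(products) if value > threshold]
    return products, selected
```

An element is favoured only if both the primal and the dual interpolation error drop there, which is what distinguishes this rule from the purely energy-driven one.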



8 Reference to numerical examples 

The performance of the presented goal-oriented hp-adaptive strategy is quite
impressive in comparison with the same algorithm applied to standard
goal-oriented adaptivity or to standard (energy-driven) hp-adaptivity. However,
there is too little space here for an adequate presentation of a realistic model
problem and a meaningful discussion of numerical results. We refer the reader to the
recent book [8] and to the paper [7], which uses the presented automatic
goal-oriented hp-adaptive strategy to resolve a challenging industrial
application related to the axisymmetric Maxwell's equations.
Summary. The collocation boundary element method for the Dirichlet boundary
value problem is considered. In order to solve the resulting linear systems efficiently
for several wave numbers, the adaptive cross approximation (ACA) method is applied
to the matrices. In particular, the algorithm is reformulated for complex problems
and the so-called Fourier method is used to compute the necessary entries. Finally,
some numerical examples for the solution are presented.



1 Introduction 

The Helmholtz equation

$$(\Delta + \kappa^2)\, u = 0, \qquad \kappa = \frac{\omega}{c} \in \mathbb{R},$$

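arises in many physical problems related to wave propagation. In acoustic
applications, $\omega$ and $c$ are the frequency and the speed of sound, and $u$
corresponds to the pressure field. We are interested in the solutions of the
associated exterior Dirichlet boundary value problem (BVP)

$$\begin{aligned}
\Delta u(y) + \kappa^2 u(y) &= 0, && y \in \mathbb{R}^3 \setminus \overline{\Omega}, \\
u(y) &= g(y), && y \in \Gamma, \\
\left( \frac{\partial}{\partial r} - i\kappa \right) u(y) &= o\!\left( \frac{1}{r} \right) && \text{for large } |y| = r,
\end{aligned} \tag{1}$$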

for a spectrum of real wave numbers $0 < \kappa < \kappa_{max}$, where $\kappa_{max}$ corresponds
to the highest frequency. In (1), $\Gamma = \partial\Omega$ denotes the smooth boundary
of the bounded, simply connected domain $\Omega$, and $g$ is a given function. Using
boundary element methods (BEM) to treat these problems, we need to solve
a large linear system for each wave number. The memory requirement for each
problem is $Mem = O(N^2)$, and the cost of a naive procedure for the matrix-vector
multiplications (using an iterative solver) is $Op = O(M N^2)$, where $M$
denotes the number of frequencies and $N$ is the number of degrees of freedom
by BEM discretisation. Typical values are N = 10^ — 10'^ for the dimension of 
the problem and M — 10 — 10^ for the wave numbers of interest. 
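To give a feeling for the quadratic memory requirement, the following sketch evaluates the storage of one dense complex $N \times N$ matrix and the naive operation count $M N^2$. It is an illustration only: the function names are hypothetical, 16 bytes per entry assumes double-precision complex numbers, and $N = 20480$ is the largest problem size appearing in the tables of Sect. 5.

```python
def dense_storage_gib(n, bytes_per_entry=16):
    """Storage of a dense n x n matrix in GiB (16 bytes = one complex double)."""
    return n * n * bytes_per_entry / 2**30


def naive_matvec_operations(m, n):
    """Leading-order cost M * N^2 of one dense matrix-vector product per wave number."""
    return m * n * n


# One dense matrix for N = 20480 already needs 6.25 GiB.
largest_case_gib = dense_storage_gib(20480)
```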
In [10] and [11] we introduced a numerical method for computing the
associated matrices, based on the Fourier transform with respect to
the wave number $\kappa$; here we discuss an efficient method for solving the
resulting linear systems. In particular, we apply the Adaptive Cross Approximation
(ACA) method (see e.g. [1], [2]) to the transformed matrices and discuss the
behaviour of the compression factors as a function of the wave number.

The paper is organised as follows.

In Sect. 2, we consider the boundary integral formulation for the problem and
its discrete form. A review of the Fourier method for computing the matrices
is presented in Sect. 3. In Sect. 4, we reformulate the ACA algorithm for the
complex problem. Finally, we give some numerical results (Sect. 5).



2 Boundary Integral Formulation and Collocation 
Method 



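We define the combined single- and double-layer potential

$$(B - i\eta A)[f](y) = \int_\Gamma \left[ \frac{\partial G(x, y, \kappa)}{\partial n_x} - i\eta\, G(x, y, \kappa) \right] f(x)\, d\Gamma_x, \quad y \in \mathbb{R}^3 \setminus \Gamma,$$

where $\eta \in \mathbb{R}^+$, and $G(x, y, \kappa)$ is the fundamental solution of the Helmholtz
equation defined by

$$G(x, y, \kappa) = \frac{1}{4\pi} \frac{e^{i\kappa |x - y|}}{|x - y|}, \quad x, y \in \mathbb{R}^3.$$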
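For orientation, the fundamental solution can be evaluated directly. The following is a minimal sketch; the function name `helmholtz_fundamental_solution` is hypothetical and points are assumed to be given as 3D coordinate triples.

```python
import numpy as np


def helmholtz_fundamental_solution(x, y, kappa):
    """Evaluate G(x, y, kappa) = exp(i*kappa*|x - y|) / (4*pi*|x - y|) in 3D."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if r == 0.0:
        raise ValueError("G is singular for x == y")
    return np.exp(1j * kappa * r) / (4.0 * np.pi * r)
```

For $\kappa = 0$ the expression reduces to the fundamental solution of the Laplace equation, $1/(4\pi |x - y|)$.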

Note that the fundamental solution and thus the above potential satisfy the 
Sommerfeld radiation condition. 

Using this potential to treat the Dirichlet BVP (1), we need to solve the
boundary integral equation (BIE)

$$\left( \tfrac{1}{2} I + B - i\eta A \right)[f](y) = g(y), \quad y \in \Gamma. \tag{2}$$

For all wave numbers $\kappa$, the following uniqueness theorem holds.

Theorem 1. The exterior Dirichlet BVP (1) has a unique solution for all 
g e 5 G IR, 



$$u(y) = \int_\Gamma \left[ \frac{\partial G(x, y, \kappa)}{\partial n_x} - i\eta\, G(x, y, \kappa) \right] f(x)\, d\Gamma_x. \tag{3}$$



In (3), f G H^{dCl) denotes the unique solution of the BIE (2). 
For more details on the analysis of the Helmholtz equation, we refer the reader
to [3], [5].

In order to solve equation (2) numerically, the surface $\Gamma$ is discretised using
a system of $N$ plane triangular panels, $\Gamma \approx \Gamma_h = \bigcup_{j=1}^{N} \Gamma_j$. It is natural to use
an approximate function $f_h$ of the form $f \approx f_h(x) = \sum_{j=1}^{N} v_j \varphi_j(x)$,
where the associated ansatz functions $\varphi_j$, $j = 1, \ldots, N$, are piecewise constant
on $\Gamma_j$.

Therefore, the BIE above leads to

$$\left( \tfrac{1}{2} I + B(\kappa) - i\eta A(\kappa) \right) v = g, \tag{4}$$

where the elements of the matrices $A$, $B$ are defined by

$$a_{ij}(\kappa) = \frac{1}{4\pi} \int_{\Gamma_j} \frac{e^{i\kappa r}}{r}\, d\Gamma_x, \qquad b_{ij}(\kappa) = \frac{1}{4\pi} \int_{\Gamma_j} \frac{e^{i\kappa r}}{r^3} (i\kappa r - 1) \langle n_x, x - y_i \rangle\, d\Gamma_x,$$

where $r := |x - y_i|$. The vector $v \in \mathbb{C}^N$ and the right-hand side of the system
are given by $(v)_j = v_j$ and $(g)_i = g(y_i)$, where $y_i$, $i = 1, \ldots, N$, denote the
corresponding collocation points. Note that the matrices in
(4) depend explicitly on the wave number $\kappa$.



3 Fourier-Method 

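In order to compute the matrices in Eq. (4) for several wave numbers, we first
apply the inverse Fourier transformation $\mathcal{F}^{-1}$ ($\kappa \to \xi$) to the matrices.

The elements of the inverse Fourier transformed matrix $\mathcal{A}(\xi) \in \mathbb{R}^{N \times N}$
are given by

$$\alpha_{ij}(\xi) = \frac{1}{4\pi} \frac{1}{\xi} \int_{\Gamma_j} \delta(\xi - r)\, d\Gamma_x, \quad i, j = 1, \ldots, N, \tag{5}$$

as a consequence of $\mathcal{F}^{-1}[e^{i\kappa r}](\xi) = \mathcal{F}^{-1}[1](\xi - r) = \delta(\xi - r)$. In the case of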

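the double-layer potential matrix, we note that

$$\mathcal{F}^{-1}\left[ e^{i\kappa r} (i\kappa r - 1) \right](\xi) = \left. \left( -r \frac{d}{dz} - 1 \right) \delta(z) \right|_{z = \xi - r}$$

and obtain $\mathcal{B}(\xi) \in \mathbb{R}^{N \times N}$ with

$$\beta_{ij}(\xi) = \frac{1}{4\pi} \int_{\Gamma_j} \frac{1}{r^3} \left. \left( -r \frac{d}{dz} - 1 \right) \delta(z) \right|_{z = \xi - r} \langle n_x, x - y_i \rangle\, d\Gamma_x, \quad i, j = 1, \ldots, N. \tag{6}$$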
Since we consider plane triangular elements $\Gamma_j$, the analytical computation of the
entries (5) and (6) is based on an appropriate transformation of the coordinates.
For more details on the analytical computation, we refer the reader to [11].
Following that paper, we restrict our further discussion to the special case
shown in Fig. 1




Fig. 1. The computation of the entries of the matrices 



and obtain the following analytical expressions: 

“u(0 = . where a^- (0 ^ - arccos , (7) 

and 

" V " 

Notice that the diagonal elements of B(^) are zero. 

In (7) and (8), irnax denotes the maximal distance of the vertex of the triangle 
to the origin, and ^min is defined by the minimum of and the 

minimal distance of the vertex of the triangle to the origin. Further, ^ is the 
maximal angle between the triangle and the ei-axis. 

The expression above implies that each element has a local support. Therefore 
the matrices A{^) and have sparse structures for a fixed ^ G [ 0 , diam(F) ]. 
It should be remarked that OLij{^) and are integrable functions with anti- 

derivatives and Furthermore, becomes singular at the point 

^ = 6 . 

We return to matrices that depend on the wave number by applying the
Fourier transformation $\mathcal{F}$ ($\xi \to \kappa$), i.e. $C(\kappa) := \mathcal{F}[\mathcal{C}(\xi)](\kappa)$ with $C = A$ and
$C = B$, respectively. Using the expressions (7) and (8), we obtain new matrices
$A(\kappa)$, $B(\kappa)$ whose elements are defined by






^ Cmin , ^ ^ 



(9) 



1 = 1 



and bii{K.) = 0, 



~ ^ • I j| / ^max ^ ^ ^ \ 

bij{K) = —sgn{d)e'‘^^ - ^ ^ (^0sinc«i)e‘'"^') • (10) 

In particular, we need to solve the linear system

$$\left( \tfrac{1}{2} I + B(\kappa) - i\eta A(\kappa) \right) v = g. \tag{11}$$



Due to the independence of the wave number the antiderivative crij{^) and 
the approximation function iii (9) and, similarly, 

(10) need to be calculated only once. Thus we will treat the respective linear 
systems (11) for several k < K,max using always these computed data. 



4 Application of the ACA 



In order to solve the above linear system efficiently, we combine
the described method with an approximation method. Such numerical
methods provide an approximation of the solution in almost
linear complexity by solving a perturbed linear system in which the matrix
is compressed. Schemes such as fast multipole [4], panel clustering [9], and
$\mathcal{H}$-matrices [7], [8] are based on explicitly given kernel approximations by
degenerate kernels, i.e. finite sums of separable functions. In contrast, the ACA
algorithm [1], which uses the $\mathcal{H}$-matrix format, is purely algebraic and relies on
a small part of the system matrix for its blockwise approximation by low-rank
matrices.

Theoretical and numerical aspects of the ACA of collocation matrices that
contain asymptotically smooth kernels are discussed in detail in [2]. It should
be pointed out that the devised algorithms can formally be applied to matrices
for which the kernel is not asymptotically smooth but can be approximated
by a degenerate function, e.g. the kernel of the Helmholtz operator. The
following estimate is shown in [6].

Lemma 1. Let x, y G IR^ and v ly|/|a^| < Vo < 1. It holds 



gi«|x-y| 

\x~y\ 



Lp{x,y) 



< c- 



K.\x\ 



-pV, 




A Compression Method for the Helmholtz Equation 791 



where 



Lp{x, 

and c only depends on uq. In (12), jn denote the spherical Bessel functions, hn^ 
the spherical Hankel functions and Pn correspond to the Legendre polynomials. 

Following the paper [2], we outline the method for the Helmholtz equation,
i.e. we reformulate the so-called partially pivoted ACA for the collocation
matrices of the Helmholtz equation.

Let $C \in \mathbb{C}^{m' \times n'}$ be a given block. The method produces vectors $u_l \in \mathbb{C}^{m'}$,
$v_l \in \mathbb{C}^{n'}$, $l = 1, \ldots, k$, from which the approximant $S_k$ can be formed,

$$S_k = \sum_{l=1}^{k} u_l v_l^*,$$

and it holds that $C = S_k + R_k$. In particular, we obtain the following algorithm.

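Algorithm (partially pivoted ACA)

Let $S_0 = 0$ and $R_0 = C$.

$Z$ denotes the index set of the computed rows of $R_k$, $k \ge 0$. Let $Z = \emptyset$ and
$i_1 := 1$.

For $k = 0, 1, \ldots$ compute

1. $k := k + 1$

2. (★) If $i_k > m'$, then exit. Else

3. $Z = Z \cup \{i_k\}$ and $\tilde{v}_k = C^* e_{i_k} - \sum_{l=1}^{k-1} \overline{(u_l)_{i_k}}\, v_l$

4. Find $j_k$ such that $|(\tilde{v}_k)_{j_k}| = \max_j |(\tilde{v}_k)_j|$

5. If $(\tilde{v}_k)_{j_k} = 0$, set $i_k := i_k + 1$ and go to ★

6. $v_k = (\tilde{v}_k)_{j_k}^{-1} \tilde{v}_k$ and $u_k = C e_{j_k} - \sum_{l=1}^{k-1} \overline{(v_l)_{j_k}}\, u_l$

7. $S_k = S_{k-1} + u_k v_k^*$

8. Find $i_{k+1}$ such that $|(u_k)_{i_{k+1}}| = \max_{i \notin Z} |(u_k)_i|$

until the stopping criterion is fulfilled.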

Since the block $C$ is not generated completely, the Frobenius norm of the
approximant $S_k$ is used in the stopping criterion. An appropriate
stopping criterion is to terminate the iteration if, for a given $\varepsilon > 0$, at step $k$
it holds that

$$\|u_k\|_F \|v_k\|_F \le \varepsilon \|S_k\|_F, \tag{12}$$

where the value of $\|S_k\|_F$ can be computed recursively as follows:

$$\|S_k\|_F^2 = \|S_{k-1}\|_F^2 + 2 \operatorname{Re} \sum_{l=1}^{k-1} u_k^* u_l\, v_l^* v_k + \|u_k\|_F^2 \|v_k\|_F^2.$$

Note that we need $O((m' + n')k^2)$ operations to generate the approximant $S_k$,
and its memory requirement is $O((m' + n')k)$.
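The algorithm and the stopping criterion (12) can be sketched together as follows. This is a minimal illustration, not the authors' implementation: the function name `aca_partial_pivoting` and the accessor callables `get_row` and `get_col` are hypothetical, indices are 0-based, and the placement of complex conjugates is an assumption chosen so that the approximant has the form $S_k = \sum_l u_l v_l^*$. Only single rows and columns of the block are requested, so the block is never assembled completely.

```python
import numpy as np


def aca_partial_pivoting(get_row, get_col, m, n, eps=1e-6, max_rank=None):
    """Approximate an m x n block C by U @ V.conj().T using partially pivoted ACA.

    get_row(i) returns row i of C, get_col(j) returns column j of C.
    """
    max_rank = min(m, n) if max_rank is None else max_rank
    us, vs = [], []          # vectors u_l (length m) and v_l (length n)
    used_rows = set()        # the index set Z of the algorithm
    norm_s2 = 0.0            # ||S_k||_F^2, updated recursively
    i = 0                    # current row pivot
    while len(us) < max_rank and i < m:
        used_rows.add(i)
        # row i of the residual, stored as the column vector R^* e_i
        v_tilde = np.conj(np.asarray(get_row(i), dtype=complex))
        for u, v in zip(us, vs):
            v_tilde = v_tilde - np.conj(u[i]) * v
        j = int(np.argmax(np.abs(v_tilde)))
        if v_tilde[j] == 0:
            # zero residual row: move on to the next unused row
            i += 1
            while i < m and i in used_rows:
                i += 1
            continue
        v_new = v_tilde / v_tilde[j]
        # column j of the residual
        u_new = np.asarray(get_col(j), dtype=complex).copy()
        for u, v in zip(us, vs):
            u_new = u_new - np.conj(v[j]) * u
        # recursive update of the squared Frobenius norm of S_k
        cross = sum(np.vdot(u_new, u) * np.vdot(v, v_new) for u, v in zip(us, vs))
        nu, nv = np.linalg.norm(u_new), np.linalg.norm(v_new)
        norm_s2 += 2.0 * np.real(cross) + (nu * nv) ** 2
        us.append(u_new)
        vs.append(v_new)
        if nu * nv <= eps * np.sqrt(max(norm_s2, 0.0)):
            break            # stopping criterion (12)
        # next row pivot: largest entry of u_k among unused rows
        candidates = np.abs(u_new)
        candidates[list(used_rows)] = -1.0
        i = int(np.argmax(candidates))
        if candidates[i] < 0.0:
            break            # every row has been used
    U = np.stack(us, axis=1) if us else np.zeros((m, 0), dtype=complex)
    V = np.stack(vs, axis=1) if vs else np.zeros((n, 0), dtype=complex)
    return U, V
```

The returned factors store $k(m + n)$ entries instead of $mn$, which is the source of the compression discussed in the numerical experiments.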

5 Numerical Experiments 





(II) (with diam(r) == 0.8m) 



Fig. 2. The surfaces 



We first consider the time-harmonic acoustic scattering of a given incoming 
plane wave 

= ( 0 , 0 , 1 ) 

by a sound-soft domain $\Omega$. Then the total acoustic wave takes the form $u = u^i + u^s$,
where $u^s$ denotes the scattered wave satisfying the Sommerfeld radiation
condition. Further, the homogeneous Dirichlet boundary condition holds for $u$, i.e.
$g = -u^i$. In particular, we present numerical experiments for the BIE (4).
Since we chose the surface of the unit sphere, see Fig. 2 (I), the solution $f(y)$
is known.



/(j/) = £i”(2n + l)- 






71=0 



iK(Kj^(K) - ir)jniK))h^n\K.) 



Pniiyd)). 



The solution obtained by the Fourier method is denoted by $f_{FT}$.
We apply the algorithm to a family of surfaces converging to the unit sphere.
The sequence is generated by recursive refinement of the meshes, dividing each
surface triangle into four and projecting the new nodes onto the unit sphere,
cf. [2].
For the approximation of the blocks we used the above algorithm and $\varepsilon$ in (12)
is chosen 10~^, while the relative accuracy of un-preconditioned GMRES was 
10 -^ 

We always consider the compression factors (CF), i.e. the ratios of the amount 
of storage needed when using the approximant and the amount of storage for 
the original matrix. 
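For a single block the definition can be made concrete. The sketch below is an illustration under assumptions: it considers one $m \times n$ block approximated by a rank-$k$ approximant stored as two factors, and the function name `compression_factor` is hypothetical. The factors reported in the tables refer to the whole matrix, i.e. to the storage summed over all blocks.

```python
def compression_factor(m, n, rank):
    """Storage of a rank-k approximant, k*(m + n) entries, relative to m*n, in percent."""
    return 100.0 * rank * (m + n) / (m * n)
```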

In the tables below, the factors are listed for different $N$: the first for
a fixed wave number $\kappa = \pi/2$ (Table 1) and the second for various wave
numbers with $\kappa \cdot h \approx 0.36$ (Table 2).



Table 1. Compression factors for $\kappa = \pi/2$

N        h       CF (%)    ||f - f_FT||_L2    ||f - f_FT||_L2 / ||f||_L2
80       0.97    100       3.91E-01           6.14E-02                      5
320      0.51    93        1.07E-01           1.62E-02                      5
1280     0.25    46        2.75E-02           4.09E-03                      5
5120     0.13    18        6.96E-03           1.03E-03                      5
20480    0.06    6         1.93E-03           2.86E-04                      4



Table 2. Compression factors for various $\kappa$ with $\kappa \cdot h \approx 0.36$

N        κ       CF (%)    ||f - f_FT||_L2    ||f - f_FT||_L2 / ||f||_L2
80       0.62    100       2.7E-01            3.74E-02                      4
320      1.20    92        1.05E-01           1.54E-02                      5
1280     2.28    49        2.65E-02           4.25E-03                      5
5120     4.60    22        1.16E-02           2.24E-03                      6
20480    9.24    8         1.93E-02           3.87E-03                      9



Figure 3 shows the behaviour of the compression factors as a function of the
wave number for different $N$.

In the graph, the curves correspond, from top to bottom, to the cases N = 80, N = 320,
N = 1280 and N = 5120. We observe that the compression factors increase with the
wave number.

Similarly, for the surface shown in Fig. 2 (II), we consider the problem

$$A[f](y) = \left( -\tfrac{1}{2} I + B \right)[w](y) \tag{13}$$

for the exterior Dirichlet BVP for the Helmholtz equation using collocation
with piecewise constant ansatz functions. Since we chose $w = G(x, y_0, \kappa)$ with
$x \in \Gamma$ and $y_0 \in \Omega$, the solution of equation (13) is known to be
$f = \partial_{n_x} G(x, y_0, \kappa)|_{x \in \Gamma}$. Table 3 shows the behaviour of the compression factors of
the single- and double-layer potential matrices for various wave numbers with
$\kappa \cdot h \approx 0.5$ for different $N$.

Fig. 3. Compression factors in dependence on the wave number (x-axis: kappa; y-axis: approximation %)



Table 3. Compression factors for various $\kappa$ with $\kappa \cdot h \approx 0.5$

N        κ     CF(SL) (%)    CF(DL) (%)    ||f - f_FT||_L2    ||f - f_FT||_L2 / ||f||_L2
1356     11    50            64            6.52E-02           1.08E-01                      38
5424     23    23            28            9.34E-02           1.15E-01                      66
21696    46    9             10            3.18E-01           2.27E-01                      150
Summary. This paper is concerned with modelling of fluid-structure interaction.
We consider two-dimensional viscous incompressible flow past a moving airfoil, which
is considered as a solid body with two degrees of freedom, allowing vertical and
torsional oscillations of the airfoil. The fluid flow is simulated by the Navier-Stokes
equations in the Arbitrary Lagrangian-Eulerian formulation, discretized by the finite
element method. We describe the SUPG stabilization of the FEM, the time
discretization, the equations describing the motion of the airfoil and the solution of the discrete
problem. The solution of a test problem is presented.



1 Formulation of a flow problem in a moving domain 

We analyze numerically two-dimensional viscous incompressible flow past a 
moving airfoil, which is considered as a solid body with two degrees of free- 
dom, allowing vertical and torsional oscillations of the airfoil. The study of 
this problem plays an important role in the design of aerospace vehicles. The 
aero-elastic stability of aerospace vehicles and the aero-elastic responses rep- 
resented by dynamic load prediction and vibration levels in wings, tails and 
other aerodynamic surfaces have a great impact on the design as well as in the 
cost and operational safety. 

We assume that $(0, T)$ with $T > 0$ is a time interval, and by $\Omega_t$ we denote
the computational domain occupied by the fluid at time $t$. By $u = u(x, t)$ and
$p = p(x, t)$, $x \in \Omega_t$, $t \in (0, T)$, we denote the velocity and the kinematic
pressure (i.e., dynamic pressure divided by the density of the fluid), respectively,
and $\nu$ denotes the kinematic viscosity.
Fig. 1. The Lagrangian (left) and Arbitrary Lagrangian-Eulerian (right) mapping

In order to simulate flow in a moving domain, we employ the Arbitrary
Lagrangian-Eulerian (ALE) method. Let us denote by $\Omega_{ref}$ the computational
domain at a chosen fixed time, the reference or original configuration. (We can set,
e.g., $\Omega_{ref} = \Omega_0$.) A one-to-one mapping of the reference configuration onto the
computational domain $\Omega_t$ at time $t$, the current configuration, is denoted by
$A_t$, i.e.

$$A_t : \Omega_{ref} \to \Omega_t, \qquad X \mapsto x(X, t) = A_t(X). \tag{1}$$
Based on this mapping we can compute the domain velocity w at all points 
X of the reference configuration i?ref for each time level: 

w{X,t) = (2) 
which can be transformed to the space coordinates x by the relation 

w = woAt^, i.e. w{x,t) = ^{A'^^{x),t) (3) 

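With the aid of the ALE mapping we compute the so-called ALE derivative $\frac{D^A}{Dt}$,
which is analogous to the material derivative in the Lagrangian approach. For
a function $f = f(x, t)$, $x \in \Omega_t$, $t \in (0, T)$, with values in $\mathbb{R}$, we set

$$\frac{D^A}{Dt} f(x, t) = \frac{\partial \tilde{f}}{\partial t}(X, t), \quad X = A_t^{-1}(x), \tag{4}$$

where $\tilde{f} = f \circ A_t$. By the chain rule we find that

$$\frac{D^A}{Dt} f = \frac{\partial f}{\partial t} + w \cdot \nabla f. \tag{5}$$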
Now we reformulate the Navier-Stokes equations in the ALE form

$$\frac{D^A}{Dt} u + [(u - w) \cdot \nabla]\, u + \nabla p - \nu \Delta u = 0 \quad \text{in } \Omega_t, \tag{6}$$

$$\nabla \cdot u = 0 \quad \text{in } \Omega_t. \tag{7}$$

This system is equipped with the initial condition

$$u(x, 0) = u_0, \quad x \in \Omega_0, \tag{8}$$

and boundary conditions. We assume that $\partial\Omega_t = \Gamma_D \cup \Gamma_O \cup \Gamma_{W_t}$, where $\Gamma_D$, $\Gamma_O$
and $\Gamma_{W_t}$ are mutually disjoint. On $\Gamma_D$, representing the inlet and, possibly,
impermeable fixed walls, we prescribe the Dirichlet boundary condition

$$u|_{\Gamma_D} = u_D. \tag{9}$$

We denote by $\Gamma_{W_t}$ the boundary of the airfoil at time $t$. (See Fig. 1, which shows
schematically the difference between the Lagrangian and the Arbitrary
Lagrangian-Eulerian mapping.) On $\Gamma_{W_t}$ we assume that the fluid velocity $u$ equals the
velocity $u_\Gamma$ of the profile:

$$u|_{\Gamma_{W_t}} = u_\Gamma = w|_{\Gamma_{W_t}}. \tag{10}$$

The part $\Gamma_O$ of the boundary represents the outlet, where we prescribe the
"do-nothing" boundary condition

$$-(p - p_{ref})\, n + \nu \frac{\partial u}{\partial n} = 0 \quad \text{on } \Gamma_O, \tag{11}$$

where $n$ is the unit outer normal to $\partial\Omega_t$ and $p_{ref}$ is a prescribed reference
outlet pressure.



2 Discretization 

There are a number of ways to carry out the space-time discretization
([5], [10]). In order to develop a stable, accurate scheme that can easily
treat complicated boundaries, we apply the finite element method (FEM). By
$Re = UL/\nu$ we define the Reynolds number. Here $U$ is a reference velocity
(usually the far-field velocity) and $L$ is the length of the airfoil. The relevant
Reynolds numbers in our applications are quite large, namely between 10^ and 
10^. (For such regimes the flow is usually turbulent, but we simulate the flow
with the aid of the classical Navier-Stokes equations without any turbulence
model.) In order to obtain a physically acceptable numerical solution, it is not
possible to use a standard Galerkin FEM; we have to introduce a suitable
stabilization. Here we apply the streamline diffusion method (also called the SUPG
method) together with grad-div stabilization of pressure, following [6], [9].




Application of a Stabilized FEM to Problems of Aeroelasticity 799 




2.1 Time discretization 



First let us describe the time discretization of the problem. We consider a
partition $0 = t_0 < t_1 < \cdots < T$, $t_k = k\tau$, with a time step $\tau > 0$, of the time
interval $[0, T]$ and approximate the solution $u(t_n)$ (defined in $\Omega_{t_n}$) at the time
instant $t_n$ by $u^n$. For the time discretization we use a second-order two-step
scheme using the computed approximate solutions $u^{n-1}$ in $\Omega_{t_{n-1}}$ and $u^n$ in $\Omega_{t_n}$
for the calculation of $u^{n+1}$ in the domain $\Omega_{t_{n+1}}$. With a given ALE mapping
$A_t$ we have

$$A_{t_{n-1}}(X) = x^{n-1}, \quad A_{t_n}(X) = x^n, \quad A_{t_{n+1}}(X) = x^{n+1}, \tag{12}$$

where $X \in \Omega_{ref}$ is a given point of the reference configuration, e.g. a node
of the triangulation. (See Fig. 2.)

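Now we define the approximation of the ALE derivative at time $t_{n+1}$ and
point $x^{n+1}$ by

$$\frac{D^A u}{Dt}(x^{n+1}, t_{n+1}) \approx \frac{3\tilde{u}^{n+1}(X) - 4\tilde{u}^{n}(X) + \tilde{u}^{n-1}(X)}{2\tau} = \frac{3u^{n+1}(x^{n+1}) - 4u^{n}(x^{n}) + u^{n-1}(x^{n-1})}{2\tau}, \tag{13}$$

with $\tilde{u}^i = u^i \circ A_{t_i}$, and obtain the problem for the unknown functions $u^{n+1}$ and $p^{n+1}$:

$$\begin{aligned}
\frac{3u^{n+1}(x^{n+1}) - 4u^{n}(x^{n}) + u^{n-1}(x^{n-1})}{2\tau}
+ \left( (u^{n+1}(x^{n+1}) - w^{n+1}(x^{n+1})) \cdot \nabla \right) u^{n+1}(x^{n+1}) & \\
- \nu \Delta u^{n+1}(x^{n+1}) + \nabla p^{n+1}(x^{n+1}) &= 0, \\
\operatorname{div} u^{n+1}(x^{n+1}) &= 0,
\end{aligned} \tag{14}$$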
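The difference quotient used in (13) is the second-order backward difference formula. A minimal sketch for scalar values sampled along one trajectory (the function name `bdf2_derivative` is hypothetical):

```python
def bdf2_derivative(u_new, u_cur, u_old, tau):
    """Second-order backward difference (3 u^{n+1} - 4 u^n + u^{n-1}) / (2 tau)."""
    return (3.0 * u_new - 4.0 * u_cur + u_old) / (2.0 * tau)
```

The formula is exact for quadratic functions of time, and its error behaves like $\tau^2$, so halving the time step reduces the error roughly by a factor of four.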
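where $w^{n+1} = w(t_{n+1})$. This problem is equipped with the boundary
conditions (9) - (11) on $\partial\Omega_{t_{n+1}}$. Taking into account that
$A_{t_{n+1}}(A_{t_i}^{-1}(x^i)) \in \Omega_{t_{n+1}}$, we can transform equations (14) completely to the
domain $\Omega_{t_{n+1}}$:

$$\begin{aligned}
\frac{3u^{n+1} - 4\hat{u}^{n} + \hat{u}^{n-1}}{2\tau}
+ \left( (u^{n+1} - w^{n+1}) \cdot \nabla \right) u^{n+1} - \nu \Delta u^{n+1} + \nabla p^{n+1} &= 0 \quad \text{in } \Omega_{t_{n+1}}, \\
\operatorname{div} u^{n+1} &= 0 \quad \text{in } \Omega_{t_{n+1}},
\end{aligned} \tag{15}$$

where $\hat{u}^i = u^i \circ A_{t_i} \circ A_{t_{n+1}}^{-1}$. This system is again equipped with the boundary
conditions (9) - (11).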

2.2 Space discretization 

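In what follows, we carry out the space discretization of the problem
to find approximations of the functions $u := u^{n+1}$ and $p := p^{n+1}$ defined in the
domain $\Omega_{t_{n+1}}$ and satisfying system (15) and the boundary conditions (9) - (11). To
this end, we reformulate this problem in a weak sense. Let us set $\Omega := \Omega_{t_{n+1}}$
and define the velocity spaces $W = (H^1(\Omega))^2$, $X = \{v \in W;\ v|_{\Gamma_D \cup \Gamma_{W_t}} = 0\}$
and the pressure space $M = L^2_0(\Omega) = \{q \in L^2(\Omega);\ \int_\Omega q\, dx = 0\}$. Then it is
easy to find that the solution $U = (u, p)$ of problem (15) satisfies

$$a(U, U, V) = f(V) \quad \forall V = (v, q) \in X \times M. \tag{16}$$

Here

$$\begin{aligned}
a(U^*, U, V) &= \frac{3}{2\tau}(u, v) + \nu (\nabla u, \nabla v) + (((u^* - w^{n+1}) \cdot \nabla) u, v) - (p, \nabla \cdot v) + (\nabla \cdot u, q), \\
f(V) &= \frac{1}{2\tau}(4\hat{u}^{n} - \hat{u}^{n-1}, v) - \int_{\Gamma_O} p_{ref}\, v \cdot n\, dS, \\
U &= (u, p), \quad V = (v, q), \quad U^* = (u^*, p),
\end{aligned} \tag{17}$$

where $(\cdot, \cdot)$ denotes the scalar product in $L^2(\Omega)$ and $\hat{u}^{n}$, $\hat{u}^{n-1}$ are the functions
$u^{n}$, $u^{n-1}$ transformed to $\Omega_{t_{n+1}}$ as in (15). Moreover, we require that $u$
satisfies the Dirichlet boundary conditions (9), (10). The couple $(u, p)$
represents the solution on the time level $t_{n+1}$, i.e. $u^{n+1} := u$ and $p^{n+1} := p$.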

In order to apply the Galerkin FEM, we shall restrict the weak formulation
from the spaces $W$, $X$, $M$ to approximate spaces $W_h$, $X_h$, $M_h$, $h \in (0, h_0)$,
$h_0 > 0$, $X_h = \{v_h \in W_h;\ v_h|_{\Gamma_D \cup \Gamma_{W_t}} = 0\}$. Hence, we want to find
$U_h = (u_h, p_h) \in W_h \times M_h$ such that $u_h$ satisfies approximately conditions
(9), (10) and

$$a(U_h, U_h, V_h) = f(V_h), \quad \forall V_h = (v_h, q_h) \in X_h \times M_h. \qquad (18)$$

The couple $(X_h, M_h)$ of the finite element spaces should satisfy the
Babuška-Brezzi (BB) condition, which guarantees the stability of the scheme:
there exists a constant $c > 0$ such that

$$\sup_{w \in X_h} \frac{(p, \nabla\cdot w)}{\|w\|_{H^1(\Omega)}} \ge c\,\|p\|_{L^2(\Omega)}, \quad \forall p \in M_h,\ h \in (0, h_0). \qquad (19)$$

We proceed in the following way. Assuming that $\Omega$ is polygonal, by $\mathcal{T}_h$
we denote a triangulation of $\Omega$ with standard properties from the FEM. The
pressure space $M$ is then approximated by the space of piecewise polynomial
functions of degree $\le k$:

$$p \approx p_h \in M_h = \{q \in M \cap C(\bar\Omega);\ q|_K \in P^k(K),\ \forall K \in \mathcal{T}_h\} \qquad (20)$$

and the velocity spaces $W$ and $X$ are approximated by the spaces of piecewise
polynomial functions of degree $\le k + 1$:

$$u \approx u_h \in W_h = \{v \in W \cap (C(\bar\Omega))^2;\ v|_K \in (P^{k+1}(K))^2,\ \forall K \in \mathcal{T}_h\}, \qquad (21)$$
$$X_h = W_h \cap X.$$

This couple $(X_h, M_h)$ satisfies the BB condition (see [13]).

In practical computations we use the Taylor-Hood elements.



3 Stabilization of the FEM 

The standard Galerkin discretization (18) may produce approximate
solutions suffering from spurious oscillations for high Reynolds numbers. In
order to avoid this drawback, we apply the stabilization via the
streamline-diffusion/Petrov-Galerkin technique (see, e.g., [10], [9], [6]). We
define the stabilization terms

$$\mathcal{L}_h(U^*, U, V) = \sum_{K \in \mathcal{T}_h} \delta_K \Big(\frac{3}{2\tau}u - \nu\Delta u + (w\cdot\nabla)u + \nabla p,\ (w\cdot\nabla)v\Big)_K,$$
$$\mathcal{F}_h(V) = \sum_{K \in \mathcal{T}_h} \delta_K \Big(\frac{1}{2\tau}(4\hat u^{n} - \hat u^{n-1}),\ (w\cdot\nabla)v\Big)_K, \qquad (22)$$
$$U = (u,p), \quad V = (v,q), \quad U^* = (u^*,p),$$

where the function $w$ stands for the transport velocity $w = u^* - w^{n+1}$,
$(\cdot,\cdot)_K$ denotes the scalar product in $L^2(K)$ and $\delta_K \ge 0$ are suitable
parameters.

Moreover, we introduce the pressure stabilization terms

$$\mathcal{P}_h(U, V) = \sum_{K \in \mathcal{T}_h} \tau_K (\nabla\cdot u, \nabla\cdot v)_K, \quad U = (u,p), \quad V = (v,q), \qquad (23)$$

with suitable parameters $\tau_K \ge 0$.

The stabilized discrete problem reads: Find $U_h = (u_h, p_h) \in W_h \times M_h$ such
that $u_h$ satisfies approximately conditions (9), (10) and

$$a(U_h, U_h, V_h) + \mathcal{L}_h(U_h, U_h, V_h) + \mathcal{P}_h(U_h, V_h) = f(V_h) + \mathcal{F}_h(V_h), \quad \forall V_h \in X_h \times M_h. \qquad (24)$$

The parameter $\delta_K$ is defined on the basis of the transport velocity $w$ as

$$\delta_K = \delta^* \frac{h_K}{2\|w\|_{L^\infty(K)}}\, \xi(Re^w), \qquad (25)$$

where

$$Re^w = \frac{h_K \|w\|_{L^\infty(K)}}{2\nu} \qquad (26)$$

is the local Reynolds number and $h_K$ is the size of the element $K$ measured
in the direction of $w$. The factor $\xi(\cdot)$ is a monotonically increasing function
of $Re^w$ such that for local advection dominance ($Re^w > 1$) $\xi \to 1$ and for
local diffusion dominance ($Re^w < 1$) $\xi \to 0$. The parameter $\delta^* \in (0, 1]$ is an
additional free parameter. We set, e.g.



^(i?e-)=min(— ,l). (27) 

The choice of the parameters tk is again different in diffusion and convection 
dominated regions. We put 

tk ^ T*/ii^||w||x.oo(^) and tk = ^ (28) 

for local advection dominance and local diffusion dominance, respectively,
where $\tau^* \in (0,1]$. For theoretical analysis of such a choice we refer to [6],
[9].
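
The element-wise evaluation of the streamline-diffusion parameter can be sketched as follows. This is an illustration only: it assumes the form $\delta_K = \delta^* h_K \xi(Re^w) / (2\|w\|_{L^\infty(K)})$ with $Re^w = h_K \|w\|_{L^\infty(K)} / (2\nu)$, and the default cut-off function `xi` below is a hypothetical stand-in that merely has the stated limits (it tends to 1 for advection dominance and to 0 for diffusion dominance); it is not claimed to be the choice (27).

```python
def local_reynolds(h_K, w_norm, nu):
    """Local Reynolds number h_K * ||w|| / (2 * nu) on one element."""
    return h_K * w_norm / (2.0 * nu)


def delta_K(h_K, w_norm, nu, delta_star=1.0, xi=lambda re: min(re, 1.0)):
    """Streamline-diffusion parameter on one element.

    h_K        -- element size in the direction of w
    w_norm     -- maximum norm of the transport velocity on the element
    nu         -- kinematic viscosity
    delta_star -- free parameter in (0, 1]
    xi         -- monotone cut-off function (hypothetical default)
    """
    if w_norm == 0.0:
        return 0.0  # no transport, hence no streamline stabilization
    return delta_star * h_K / (2.0 * w_norm) * xi(local_reynolds(h_K, w_norm, nu))
```

For an advection-dominated element the cut-off is inactive and $\delta_K$ scales like $h_K / \|w\|$; for a diffusion-dominated element it is reduced in proportion to the local Reynolds number.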

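The nonlinear problem (24) is solved iteratively on each time level. Starting
from an initial approximation $U_h^{(0)}$ and assuming that the iterate $U_h^{(\ell)}$
has already been computed, we define $U_h^{(\ell+1)} \in W_h \times M_h$ by

$$a(U_h^{(\ell)}, U_h^{(\ell+1)}, V_h) + \mathcal{L}_h(U_h^{(\ell)}, U_h^{(\ell+1)}, V_h)
 + \mathcal{P}_h(U_h^{(\ell+1)}, V_h) = f(V_h) + \mathcal{F}_h(V_h), \quad \forall V_h \in X_h \times M_h. \qquad (29)$$

For each time level $t_{n+1}$ we set

$$U_h^{(0)} = (2\hat u^{n} - \hat u^{n-1}, p^{n}). \qquad (30)$$

As numerical experiments show, only a few iterations (29) have to be computed
on each time level.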
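
The structure of iteration (29), freezing the convecting velocity at the previous iterate and solving a linear problem, can be illustrated on a scalar toy analogue. The sketch below is not the finite element solver: it solves the single nonlinear equation $\frac{3}{2\tau}u + u\cdot u = \text{rhs}$, and the function name, tolerance and sample values are assumptions chosen for illustration. It shows why a starting guess extrapolated from previous time levels needs only a few sweeps when $\tau$ is small.

```python
def picard_solve(rhs, tau, u_start, tol=1e-12, max_iter=50):
    """Fixed-point iteration for (3 / (2 * tau)) * u + u * u = rhs.

    Each sweep freezes the 'convecting' factor at the previous iterate and
    solves the resulting linear equation (3 / (2 * tau) + u_prev) * u_next = rhs.
    Returns the final iterate and the number of sweeps performed.
    """
    u_prev = u_start
    for iteration in range(1, max_iter + 1):
        u_next = rhs / (1.5 / tau + u_prev)
        if abs(u_next - u_prev) <= tol:
            return u_next, iteration
        u_prev = u_next
    return u_prev, max_iter


# With tau = 0.01 the exact solution of 150 * u + u * u = 151 is u = 1.
u_final, sweeps = picard_solve(rhs=151.0, tau=0.01, u_start=0.98)
```

The contraction factor of this toy iteration is of order $\tau$, which mirrors the observation that few iterations suffice on each time level.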

Obviously, problem (29) is linear. It is equivalent to the linear algebraic
system

$$S\mathbf{u} + 2\tau B\mathbf{p} = \mathbf{f}, \quad B^T\mathbf{u} = 0, \qquad (31)$$

where $\mathbf{u} \in R^{n_h}$ and $\mathbf{p} \in R^{m_h}$ are vectors whose components represent
degrees of freedom defining the velocity $u$ and the pressure $p$, respectively,
$S$ is a nonsingular $n_h \times n_h$ matrix and $B$ is an $n_h \times m_h$ matrix. The solution
of this system was realized by the direct solver UMFPACK ([1], [2]), which works
sufficiently fast for systems with up to 10^ equations. For larger systems the 
domain decomposition approach or algebraic multigrid ([11], [12]) will be used.

4 Description of the airfoil motion

The airfoil can oscillate in the vertical direction and in the angular direction
around the so-called elastic axis. This vertical and torsional motion is described
by the linearized system of ordinary differential equations (see [8], [3])

$$m\ddot H + k_{HH} H + S_\alpha \ddot\alpha = -F,$$
$$S_\alpha \ddot H + I_\alpha \ddot\alpha + k_{\alpha\alpha}\alpha = M, \qquad (32)$$

where the following notation is used: $H$ - vertical displacement (oriented
downward), $\alpha$ - angle of rotation around the elastic axis, $k_{HH}$ - displacement
stiffness, $S_\alpha$ - static moment around the elastic axis, $I_\alpha$ - inertia moment,
$k_{\alpha\alpha}$ - torsional stiffness.

The force $F$ acting in the vertical direction and the torsional moment $M$
are defined by

$$F = -\int_{\Gamma_{W_t}} \sum_{j=1}^{2} \tau_{2j} n_j \, dS, \qquad (33)$$




= -(X2- XT2), =Xi- XTl, 



$n = (n_1, n_2)$ is the unit outer normal to $\partial\Omega_t$ on $\Gamma_{W_t}$ (pointing into the
airfoil), $x_T = (x_{T1}, x_{T2})$ is the given position of the elastic axis (lying in
the interior of the airfoil) and $\rho$ is the fluid density.

System (32) is transformed to a first-order ODE system and then solved by
the second-order Runge-Kutta method. We proceed in such a way that the
computed approximate solution $U_h$ of (24) on time levels $t_n$ and $t_{n-1}$ and the
corresponding force $F$ and moment $M$ are extrapolated and used for obtaining
$H$ and $\alpha$ at $t_{n+1}$. This allows us to determine the mapping $A_{t_{n+1}}$, the
domain $\Omega_{t_{n+1}}$ and approximate the domain velocity $w^{n+1}$. Then we pass to
(24) on the next time level $t_{n+1}$.
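
The reduction of the airfoil system to first order and one Runge-Kutta step can be sketched as follows. This is a minimal illustration under stated assumptions: the system is taken in the form $m\ddot H + k_{HH}H + S_\alpha\ddot\alpha = -F$, $S_\alpha\ddot H + I_\alpha\ddot\alpha + k_{\alpha\alpha}\alpha = M$; the midpoint variant of the second-order Runge-Kutta method is used, since the text does not say which variant was applied; $F$ and $M$ are held constant over the step (standing in for the extrapolated values); and all names and parameter values are hypothetical.

```python
def accelerations(H, alpha, F, M, p):
    """Solve the 2x2 mass system [[m, S_a], [S_a, I_a]] (H'', alpha'')^T = rhs."""
    rhs_H = -F - p["k_HH"] * H
    rhs_a = M - p["k_aa"] * alpha
    det = p["m"] * p["I_a"] - p["S_a"] ** 2
    H_dd = (p["I_a"] * rhs_H - p["S_a"] * rhs_a) / det
    a_dd = (p["m"] * rhs_a - p["S_a"] * rhs_H) / det
    return H_dd, a_dd


def rhs(y, F, M, p):
    """First-order form: y = (H, alpha, H', alpha')."""
    H, alpha, H_d, a_d = y
    H_dd, a_dd = accelerations(H, alpha, F, M, p)
    return (H_d, a_d, H_dd, a_dd)


def rk2_step(y, tau, F, M, p):
    """One explicit midpoint (second-order Runge-Kutta) step of size tau."""
    k1 = rhs(y, F, M, p)
    y_mid = tuple(yi + 0.5 * tau * ki for yi, ki in zip(y, k1))
    k2 = rhs(y_mid, F, M, p)
    return tuple(yi + tau * ki for yi, ki in zip(y, k2))
```

In a coupled computation the returned $H$ and $\alpha$ would define the new airfoil position, from which the ALE mapping and the domain velocity for the next flow step are built.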

5 Numerical results 

We have performed a number of numerical simulations. In this paper we present
the results obtained for the profile NACA 63_2-415. The length of the profile
is 0.3 m and the far-field velocity is 20 m s^{-1}; together with the kinematic
viscosity of air this gives a Reynolds number equal to 400000. Fig. 5 shows the
position of the moving profile and velocity isolines at four time instants.
Von Karman vortices leaving the airfoil are clearly visible. Moreover, the
oscillations of the vertical position h of the airfoil and of the angle $\alpha$ of
rotation around the elastic axis are shown, and the values corresponding to the
above airfoil positions are marked.

Summary. The dynamical behavior of pulses, which is governed by the interaction
between diffusion and absorption, shows several phenomena. The most remarkable
ones are the pulse splitting phenomena accompanied by pulse connecting
phenomena. In this paper such phenomena are investigated numerically, and a
mathematical justification is stated.



1 Introduction 

We consider the propagation of thermal waves in a one-dimensional absorbing
medium in which there is an interaction between diffusion and absorption. To
describe such a propagation we may use the nonlinear diffusion equation with
absorption, which is well known as a description of the flow of liquids
through a homogeneous porous medium and is represented in the form of
the initial value problem:

$$v_t = (v^m)_{xx} - c v^p, \quad x \in R^1,\ t > 0, \qquad (1.1)$$
$$v(0,x) = v^0(x), \quad x \in R^1. \qquad (1.2)$$

Here we have the following assumptions:

(i) $m\,(> 1)$, $p\,(> 0)$, and $c\,(\ge 0)$ are constants and $m + p \ge 2$;

(ii) $v^0(x) \in C^0(R^1)$ is nonnegative and has compact support.

In a heated plasma $v$ denotes the temperature and $-cv^p$ describes the losses
caused by radiation. We may take p = 0.5 for bremsstrahlung radiation and
0.5 < p < 2 for synchrotron radiation [14]. The diffusion rate of (1.1) vanishes
at points where $v = 0$. This degeneracy causes the finite propagation of the
support.

From the analytical point of view, Aronson [1], Oleinik, Kalashnikov and
Chzou Yui-Lin [13], Kalashnikov [7], [8], and Herrero and Vazquez [6] proved
the existence and uniqueness of a weak solution and the property of the finite
propagation of the support under the assumptions stated above. Moreover,
$v(t,x)$ is smooth in the open set $P(v) = \{(t,x) \mid v(t,x) > 0 \text{ and } t > 0\}$, and has
the following properties:

(P-1) For $c = 0$, or $c > 0$ and $p \ge 1$, the diffusion is active and supp $v(t,\cdot)$
expands monotonically as $t$ increases;

(P-2) For $c > 0$ and $0 < p < 1$ the absorption is active and the solution
vanishes identically at some finite time $T^* > 0$.

In Case (P-1) supp $v(t,\cdot)$ never splits into multiple connected components
for $t > 0$ when supp $v^0(x)$ is connected. Thus the pulse splitting phenomena
never appear. In Case (P-2) there is a possibility of pulse splitting phenomena
caused by absorption when $v^0(x)$ has two local maxima. Rosenau
and Kamin [14] suggested this possibility by numerical computation. Chen,
Matano and Mimura [3] constructed a pulse which splits into multiple connected
components in a finite time. This motivates us to investigate the behavior of
pulses in more detail. To this end we continue the numerical computation
and find the following phenomena, where the initial pulse $v^0(x)$ has two local
maxima and a connected compact support:

(NS-1) The pulse splitting phenomena appear, and thereafter these two pulses
evolve separately until one of them vanishes (see the left hand side in
Fig. 1);

(NS-2) The pulse splitting phenomena never appear for t > 0 (see the right
hand side in Fig. 1);

(NS-3) After the pulse splitting phenomena appear, these pulses become
connected, and thereafter the pulse splitting phenomena appear again (see
Fig. 2).




Fig. 1. Numerical support splitting and non-splitting phenomena 



When $m + p = 2$ and $0 < p < 1$, we obtain some sufficient conditions
under which the phenomena (NS-1) and (NS-2) appear ([12], [15]). However,
we are unable to answer mathematically the question whether the phenomenon
(NS-3) is true or not. In this paper, we try to justify a part of it, which is as
follows.

(NS-4) The pulse splitting phenomena appear, and thereafter these pulses
become connected.

Fig. 2. Numerical support splitting phenomena with connecting property



We call such phenomena pulse splitting phenomena with connecting property,
and assume the following conditions.

Condition A. $c > 0$, $m + p = 2$ and $0 < p < 1$ hold;

Condition B. i) $v^0(x) \in C^0(R^1)$ is a nonnegative function with compact
support and $((v^0)^{m-1})_x \in L^\infty(R^1) \cap BV(R^1)$.

ii) $((v^0)^{m-1})_x$ is absolutely continuous on $I = \{x \mid v^0(x) > 0\}$ and
ess.inf$_I\, ((v^0)^{m-1})_{xx}$ is finite.

Our proof is based on the finite difference scheme ([10], [11], [12]), the
comparison theorem [2] and Kersner's exact solution [9]. Unfortunately, in the
case where $m + p \ne 2$, $m > 1$ and $0 < p < 1$, we are unable to find any
exact solution or to construct a convergent finite difference scheme. This is
the reason why we are concerned with the specific case stated in Condition A.



2 Finite difference schemes 

We put $u = v^{m-1}$ and rewrite (1.1)-(1.2) as follows:

$$u_t = m u u_{xx} + a (u_x)^2 - c', \qquad (4.1)$$
$$u(0,x) = u^0(x) = (v^0(x))^{m-1}, \qquad (4.2)$$

where $a = \frac{m}{m-1}$, $c' = (m-1)c$ and the absorption term is written as the
constant $-c'$ by the assumption $m + p = 2$. Our difference scheme approximates
the problem (4.1)-(4.2) instead of (1.1)-(1.2), and is described as follows:
Find the sequence $\{u_h^n\}_{n=0,1,\dots} \subset V_h$ for each $u_h^0 \in V_h$ such that

$$u_h^{n+1} = S_{h,k} u_h^n \quad \text{for } n = 0, 1, 2, \dots, \qquad (4.3)$$

where $\ell(u_h^0) = \ell(u^0)$, $r(u_h^0) = r(u^0)$ and $u_h^0(x_i) = u^0(x_i)$ for all $i \in Z$,
$h$ is a space mesh width and $V_h$ is the set of the nonnegative continuous
functions $u_h$ with the following properties:

(i) $u_h$ has compact support;

(ii) $u_h$ is linear on each interval $[x_i, x_{i+1}]$ ($i \in Z$), where

$$x_i = \begin{cases} ih & \text{for } i \in Z \setminus \{L-1, R+1\}, \\ \ell & \text{for } i = L-1, \\ r & \text{for } i = R+1, \end{cases} \qquad (4.4)$$
$$L = L(\ell) = \min\{i \in Z \mid ih > \ell\}, \quad \ell = \ell(u_h), \qquad (4.5)$$
$$R = R(r) = \max\{i \in Z \mid ih < r\}, \quad r = r(u_h). \qquad (4.6)$$

The operator $S_{h,k}$ has a somewhat complicated form; its details are stated in
[10], [11] and [12], and we omit its description here. The variable time step
$k = k_{n+1} = t_{n+1} - t_n$ ($t_0 = 0$) is determined by

k = ^max{uL^UL^i) for the approximation to the left interface, (4.7) 
c 
or 

k = ^max{uR^Uft-i) for the approximation to the right interface. (4.8) 
c 

The left and right numerical interfaces are defined by

$$\ell_n = \ell(u_h^n) \quad \text{and} \quad r_n = r(u_h^n) \quad \text{for } n = 0, 1, 2, \dots, \qquad (4.9)$$

respectively. When $S_{h,k} u_h^{n^*} = 0$ holds for some integer $n^* > 0$, we put the
numerical extinction time $T_h^* = t_{n^*} + k_{n^*+1}$, and stop the numerical
computation. We define the left (resp. right) numerical interface curve $\ell_h(t)$
(resp. $r_h(t)$) by piecewise-linearly interpolating $(t_n, \ell_n)$ (resp. $(t_n, r_n)$)
($0 \le t \le T_h^*$).
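
As a consistency check on the transformed problem (4.1), the following short derivation (a sketch added for the reader, valid wherever $v > 0$) shows how the substitution $u = v^{m-1}$ turns (1.1) into (4.1), and why the absorption term becomes a constant exactly when $m + p = 2$.

$$u_t = (m-1)v^{m-2}v_t = (m-1)v^{m-2}\big[(v^m)_{xx} - c v^p\big], \qquad (v^m)_{xx} = m v^{m-1}v_{xx} + m(m-1)v^{m-2}v_x^2,$$
$$\Rightarrow\quad u_t = m(m-1)v^{2m-3}v_{xx} + m(m-1)^2 v^{2m-4}v_x^2 - (m-1)c\,v^{m+p-2}.$$

On the other hand, with $u_x = (m-1)v^{m-2}v_x$ and $u_{xx} = (m-1)v^{m-2}v_{xx} + (m-1)(m-2)v^{m-3}v_x^2$,

$$m u u_{xx} + \frac{m}{m-1}(u_x)^2 = m(m-1)v^{2m-3}v_{xx} + \big[m(m-1)(m-2) + m(m-1)\big]v^{2m-4}v_x^2 = m(m-1)v^{2m-3}v_{xx} + m(m-1)^2 v^{2m-4}v_x^2.$$

Comparing the two expressions gives $u_t = m u u_{xx} + \frac{m}{m-1}(u_x)^2 - (m-1)c\,v^{m+p-2}$, and the last term reduces to the constant $(m-1)c = c'$ when $m + p = 2$.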

We obtain the basic estimates which enable the proofs of the convergence 
of the numerical solutions and the convergence of the numerical interface 
curves ([10], [11], [12]). The former can be proved by Graveleau and Jamet’s ar- 
gument used in the proofs of Lemma 6.1 and Theorem 7.1 ([5]). The latter can 
be proved by applying the idea of DiBenedetto and Hoff [4] to our difference 
scheme. Moreover, we obtained the interface equation (see Main Theorem in 
[11]). We state the basic estimates and the convergence of numerical solutions 
without proof. 

Theorem 1 (Basic estimates). Under Condition A assume $u_h^0 \in V_h$. Then
$u_h^n$ either becomes extinct or belongs to $V_h$ for each $n \ge 0$, and the following
estimates hold for all $n \ge 0$:



n <tn + (4.10) 

£o - a||(M°)x||ootTi <in<rn< + a|| (w° )x ||ooin, if 0. (4-11) 

0 < u^(a;) < max(||w°||oo - 0) on R\ (4-12) 

||K)x||oo < \\{ul)4oo, (4.13) 

TV{{ul)4 < TV{{ul)4, (4.14) 
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+c'(ro - 4 + 2a\\{ul)a;\\ootn), (4.15) 

inf 5\° < inf (4.16) 

ieTt ieTi 

where |1 • ||oo denotes |1 • 

Theorem 2 (Convergence of Numerical Solutions). Under Conditions A
and B let $\{h\}$ be an arbitrary sequence which tends to zero. Then there exists
a unique weak solution $v$ of (1.1)-(1.2), and

$$\|u_h - v^{m-1}\|_{L^\infty(H)} \to 0 \quad \text{and} \quad |T_h^* - T^*| \to 0 \quad \text{as } h \to 0, \qquad (4.17)$$

where $H = [0, \infty) \times R^1$, $u_h(t,x) = u_h^n(x)$ on $[t_n, t_{n+1}) \times R^1$
for all $t_n$ and $h$, and $T^*$ is the extinction time.



3 Kersner’s exact solution 



When m + p = 2 and m > 1, there is Kersner’s exact solution [9] given by 
K{t,x,p,a) = {bit + b 2 {(j))~^ (2.1) 



ai{p,a){bit -\-b 2 {(r))^+^ - 02 {bit + b 2 {cr))'^ - x^ 



where 



c{m — 1)^(7^ + Arn?p^ c{m — 1)^ 

ai{p,a)= , 2 ^ « 2 - 



hi 



Am?{{m — l)cr}’T^+i 
2m(m + 1) 

— , b 2 {a) = {m - l)a, 



Am? 



m - 



( 2 . 2 ) 



(2.3) 



$\rho > 0$ and $\sigma > 0$ are arbitrary numbers and $[g]_+ = \max\{g, 0\}$. This solution
satisfies (1.1)-(1.2) with $v^0(x) = K(0,x,\rho,\sigma)$ in the weak sense and becomes
a classical solution in the open set wherein $K(t,x,\rho,\sigma) > 0$. It is easily seen
that supp $K(0,x,\rho,\sigma) = [-\rho, \rho]$ and the right and left interface curves $\zeta_+$ and
$\zeta_-$ are written as follows, respectively:



C±(f) = ±|ai(p,cr)(6if + &2(o-))"+i - a2{bit + b2{a-)fy 



(2.4) 



When 



cr < 



2p jTn 
(m — 1)2 Y c 



holds, supp K{t,x, p,a) expands for t G [0, T(p, cr)], where 



(2.5) 
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f{p,cr) = 



h 



[ ai(p,cr) ] 2 
\(m + l)a2 J 




( 2 . 6 ) 



For t > T(p, a) supp K{t, x, p, cr) shrinks and K{t^ x, p, a) identically vanishes 
at the extinction time t = cr) given by 





4 Pulse splitting phenomena with connecting property 

For a positive number 7 we define an even function v^{x) by 

= + on(-oo0], 

x) on (0,oo). ^ ^ 

Let 77 G (0, p)(0 < ?7 < p) be an arbitrary fixed constant and e be an arbitrary 
positive number such that s < iiT(0, p — p, p, cr). For u°(x) we introduce an 
even function v^{x) satisfying the following 

Condition C. i) u°(x) = v^{—x) and v^{x) < v^{x) hold on R^; 

ii) 

fi^(0,a: + /9 + 7,p,cr) on (-00,-7-77], 

\ £ on [-7,0], '' ■ 
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and v^(x) is a decreasing function on [—7 — 77, —7]; 

iii) < v^(x) holds for e' < 

iv) v^{x) is sufficiently smooth in its support [— 2p — 7, 7 + 2p] and 
||w°xlloo < llwSiloo, TV{ul^) < (TF(m°)), ess.inf > ess.inf (3.3) 

where vP{x) = {v^{x))^~^ and u^{x) = {v^{x))^~^ . 

Let $v(t,x)$ and $v_\varepsilon(t,x)$ be the solutions of (1.1)-(1.2) with $v(0,x) = v^0(x)$ and
$v_\varepsilon(0,x) = v_\varepsilon^0(x)$, respectively. Then we obtain the following theorem, which
justifies the appearance of the phenomena (NS-4).

Theorem 3. Under Conditions A and C let $\gamma$, $\rho$ and $\sigma$ be constants such that
supp $v(t,x)$ becomes connected at $t = T(\rho,\sigma)$. Then, for some $\varepsilon$, there exist
$\bar t$ ($0 < \bar t < T(\rho,\sigma)$) and $\bar x$ ($-\gamma < \bar x < \gamma$) such that $v_\varepsilon(\bar t, \bar x) = 0$ holds, and
supp $v_\varepsilon(T(\rho,\sigma), \cdot)$ is connected.

Proof. Putting $S = [0, T(\rho,\sigma)] \times [-\gamma, \gamma]$, we show that $S$ contains at least
one point $(\bar t, \bar x)$ such that $v_\varepsilon(\bar t, \bar x) = 0$ for some positive constant $\varepsilon$. To this
end we assume the contrary; that is, suppose $v_\varepsilon(t,x) > 0$ on $S$ for $\varepsilon > 0$. Then
the following estimates hold by Theorems 1 and 2.



0 < Ue{t, •) < max(|l7/^||oo — c't, 0) on R^, (3.4) 

•) l|oO ^ ||'^£xl|o05 (^-^) 

{t,x)\dx < TV{u,,{t , .)) < (3.6) 

J-7 

where By using these inequalities and Condition C we obtain 

/ 7 rl 

Us{t,x)dx = / Ue{0,x)dx 
-7 J — 7 



-7 

J J {mue{t,x)usxx{t,x) + a{uex{t,x))‘^ - c'} dxdt 



+ 



= 2^8 
rt 



,m— 1 



- J | 27 c' - (m - 2)a J Us{t,x)uexx{t,x)dx - a^Us{t,x)usx{t,x)^ | 

< 27^"^“^ - / 27 c' - a m^x Ue{t,x)({2 - m)TF(u^) + 2|l7z°||oo) | 
f [0,t]x[-7,7] V / J 



dt 



a max 
[0,t]x[-7/ 

for t e [0,T(p, cr)]. 

Let d\ be an arbitrary fixed positive constant such that 

^ ^ 2tc^ 

a((2 - m)TV{ul) + 2||Mg||oo) 



t 

(3.7) 



(3.8)

Then, by the continuity of the solution $v_\varepsilon(t,x)$ and the comparison theorem
on the initial data ([2]) there exist positive constants $\varepsilon_1$ and $T_1 < T(\rho,\sigma)$ such
that



max uJt^x)<di for t<T\ and £ < .Si < min (<ii, iC(0, p — 77, p, cr)) 
[0,t]x[-7,7] 

(3.9) 

We put 

T2{e) = i (3.10) 

{27c' - ad, ((2 - m)TV{ul) + 2||«g lU) } 

and choose e < £\ such that T2{s) < T\. Hence, it follows from (3.7) that 



/: 



Ui{t,x)dx <0 for /: G [T2(£),Ti], 



(3.11) 



which is a contradiction. Thus, $v_\varepsilon(\bar t, \bar x) = 0$ holds for some $(\bar t, \bar x) \in S$. It is clear
by the comparison theorem that supp $v_\varepsilon(T(\rho,\sigma), \cdot)$ becomes connected, which
completes the proof.

Summary. A fully two-dimensional approximate Riemann solver of HLLEC type
for the Euler equations, with sixteen states and central waves which approximate
contact discontinuities, has been developed. The Riemann solver is used in
Godunov and WAF finite difference schemes. A numerical example of their
performance is presented.



1 Introduction 

The aim of this paper is to describe the class of HLLE Riemann solvers and 
propose a fully two-dimensional extension of the one-dimensional HLLC Rie- 
mann solver from [8]. 

In 1983, Harten, Lax and van Leer suggested [3] approximating the solution
of a Riemann problem by three constant states, separated by two waves
propagating with constant speeds. A particular algorithm for the computation of
these wave speeds was presented five years later by Einfeldt [2]. Since the
assumption of two waves is correct only for hyperbolic systems of two equations,
Toro, Spruce and Speares [8] added one more wave, creating the 4-state
one-dimensional HLLC solver. In fluid dynamics, this new wave corresponds
to a contact discontinuity. Later, Wendroff presented a series of 9-state solvers,
extending the 3-state HLLE approach to two dimensions [10], [11]. In this paper,
we construct a contact-corrected version of this approach, adding six new
waves.

The outline is as follows: First, we introduce the basic principles of HLLE
and HLLC Riemann solvers in the simple one-dimensional case. Then we proceed
to two dimensions, describe the 9-state solver from [10], [11] and introduce
its new, contact-corrected version. Finally, we demonstrate the application of
our new solver in two particular numerical methods, namely the Godunov-type
difference scheme and the WAF (Weighted Average Flux) approach, and
present results of selected numerical tests.

Fig. 1. 1D solvers: (a) 3-state HLLE; (b) 4-state HLLC



2 One Dimension 

Methods presented in this paper have been derived for the Euler equations. In
one dimension we have

$$\begin{pmatrix} \rho \\ \rho u \\ E \end{pmatrix}_t + \begin{pmatrix} \rho u \\ \rho u^2 + p \\ (E + p)u \end{pmatrix}_x = 0, \qquad (1)$$

where $\rho$ is the density, $u$ the fluid velocity, $p$ the pressure and $E$ the density
of the total energy. Completing the system (1) by the equation of state for an
ideal polytropic gas, $p = (\gamma - 1)(E - \frac{1}{2}\rho u^2)$, we can write it in the general
differential form $w_t + f(w)_x = 0$.

Consider the initial Riemann problem

$$w(x, t^n) = \begin{cases} W_0 = (\rho_0, \rho_0 u_0, E_0)^T & \text{for } x_i < x < x_{i+\frac12}, \\ W_1 = (\rho_1, \rho_1 u_1, E_1)^T & \text{for } x_{i+\frac12} < x < x_{i+1}. \end{cases} \qquad (2)$$

Following [11], we will approximate the solution at $t^n + \Delta t$ with
three constant states $W_0$, $W_{\frac12}$ and $W_1$, divided by two waves propagating
with constant speeds $b^0$ and $b^1$. The situation is shown in Fig. 1(a). With
this layout and notation, the integral form of the conservation law over this
space-time domain becomes

$$\frac{\Delta x}{2}(W_0 + W_1) = \Big(\frac{\Delta x}{2} + b^0 \Delta t\Big) W_0 + (b^1 \Delta t - b^0 \Delta t)\, W_{\frac12}
 + \Big(\frac{\Delta x}{2} - b^1 \Delta t\Big) W_1 + [f(W_1) - f(W_0)]\, \Delta t. \qquad (3)$$

Note that data below the plots in Fig. 1 show the absolute position, while data
above the plots are relative distances to the center of the staggered cell. Such
simplified notation will be especially advantageous later in 2D.

Solving (3) for $W_{\frac12}$ gives

$$W_{\frac12} = \frac{b^1 W_1 - b^0 W_0 - f(W_1) + f(W_0)}{b^1 - b^0}. \qquad (4)$$

To have the scheme completely determined, we must decide how to choose the
wave speeds $b^0$ and $b^1$. We will again follow [11] and use the Einfeldt speeds,
based on Roe averages, as summarized in [7]. Now we know all the states and
wave speeds of the approximate Riemann solver. Such a solver can be used in
several difference schemes, as we will see in Sect. 4.
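
For completeness, the step from the integral balance (3) to the intermediate state (4) can be written out; the lines below are a reader's sketch of that algebra. The terms $\frac{\Delta x}{2}W_0$ and $\frac{\Delta x}{2}W_1$ appear on both sides of (3) and cancel, and dividing the remainder by $\Delta t$ gives

$$0 = b^0 W_0 + (b^1 - b^0)\, W_{\frac12} - b^1 W_1 + f(W_1) - f(W_0)$$
$$\Rightarrow\quad W_{\frac12} = \frac{b^1 W_1 - b^0 W_0 - f(W_1) + f(W_0)}{b^1 - b^0}.$$

The cell width $\Delta x$ and the time step $\Delta t$ drop out, so the intermediate state depends only on the two initial states and the two wave speeds, provided $b^0 < b^1$.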

As mentioned above, Toro et al. [8] extended this solver to a 4-state version,
called HLLC, which resolves stationary contact discontinuities exactly.
The main idea is to split the intermediate state by a third wave, representing
a contact. Since velocity and pressure stay unchanged across contact
discontinuities, the two central states will differ only in density (and, of course,
total energy, but this can be computed from pressure and other known state
variables). So, consider the initial Riemann problem (2). Then the 4-state solver
can be summarized by the following algorithm:

1. Compute the Einfeldt wave speeds $b^0$, $b^1$ from Roe averages as in [7].

2. Compute the intermediate state from (4) the same way as in the 3-state
solver (Fig. 1(a)). Denote this state $W_{\frac12} = (\rho_{\frac12}, \rho_{\frac12}u_{\frac12}, E_{\frac12})^T$.

3. Compute $u_{\frac12}$, $p_{\frac12}$ and $\rho_{\frac12}$ from $W_{\frac12}$.

4. Assume $u_{\frac12}$ and $p_{\frac12}$ to be preserved across the contact discontinuity. Then
the contact also has to move with speed $u_{\frac12}$, and the densities in the two new
intermediate regions can be computed from the scalar Rankine-Hugoniot
condition for the mass conservation, applied to the left, resp. right wave:

$$\rho_{\frac12,L} = \rho_0\, \frac{u_0 - b^0}{u_{\frac12} - b^0}, \qquad \rho_{\frac12,R} = \rho_1\, \frac{u_1 - b^1}{u_{\frac12} - b^1}.$$

5. Compute both intermediate states (Fig. 1(b)): 




(5) 



Note that there is also another way to evaluate $\rho_{\frac12,L}$, $\rho_{\frac12,R}$, $E_{\frac12,L}$ and
$E_{\frac12,R}$ only from Rankine-Hugoniot conditions [6, 7, 1].
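
The five steps of the 4-state solver can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' implementation: $\gamma = 1.4$ is assumed, `simple_wave_speeds` is a hypothetical stand-in that bounds the signal speeds by $u \mp a$ on both sides instead of using the Einfeldt speeds with Roe averages, and the two intermediate states are assembled from density, velocity and pressure via the equation of state, as the text indicates for the total energy.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats (not given in the text)


def primitive(W):
    """Return (rho, u, p) for a conserved state W = (rho, rho*u, E)."""
    rho, mom, E = W
    u = mom / rho
    return rho, u, (GAMMA - 1.0) * (E - 0.5 * rho * u * u)


def conserved(rho, u, p):
    """Return (rho, rho*u, E), with E from the ideal-gas equation of state."""
    return (rho, rho * u, p / (GAMMA - 1.0) + 0.5 * rho * u * u)


def flux(W):
    """Physical flux f(W) of the 1D Euler equations."""
    rho, u, p = primitive(W)
    return (rho * u, rho * u * u + p, (W[2] + p) * u)


def simple_wave_speeds(W0, W1):
    """Hypothetical stand-in for the Einfeldt speeds (no Roe averages)."""
    rho0, u0, p0 = primitive(W0)
    rho1, u1, p1 = primitive(W1)
    a0 = math.sqrt(GAMMA * p0 / rho0)
    a1 = math.sqrt(GAMMA * p1 / rho1)
    return min(u0 - a0, u1 - a1), max(u0 + a0, u1 + a1)


def hlle_middle_state(W0, W1, b0, b1):
    """Intermediate state of the 3-state solver, formula (4)."""
    f0, f1 = flux(W0), flux(W1)
    return tuple((b1 * w1 - b0 * w0 - g1 + g0) / (b1 - b0)
                 for w0, w1, g0, g1 in zip(W0, W1, f0, f1))


def hllc_states(W0, W1):
    """Steps 1-5: return both middle states and the speeds (b0, u_contact, b1)."""
    b0, b1 = simple_wave_speeds(W0, W1)              # step 1 (simplified speeds)
    W_half = hlle_middle_state(W0, W1, b0, b1)       # step 2
    _, u_half, p_half = primitive(W_half)            # step 3
    rho0, u0, _ = primitive(W0)
    rho1, u1, _ = primitive(W1)
    rho_L = rho0 * (u0 - b0) / (u_half - b0)         # step 4, left wave
    rho_R = rho1 * (u1 - b1) / (u_half - b1)         # step 4, right wave
    return (conserved(rho_L, u_half, p_half),        # step 5
            conserved(rho_R, u_half, p_half),
            (b0, u_half, b1))
```

For a stationary contact (equal pressure and zero velocity on both sides, different densities) the two middle states reproduce the left and right densities, which is the exact-resolution property mentioned above.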



3 Two-dimensional Riemann Solvers 

In two dimensions, the Euler equations have the form

$$\begin{pmatrix} \rho \\ \rho u \\ \rho v \\ E \end{pmatrix}_t
 + \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ (E + p)u \end{pmatrix}_x
 + \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ (E + p)v \end{pmatrix}_y = 0 \qquad (6)$$

with the equation of state for an ideal polytropic gas, $p = (\gamma - 1)(E - \frac{1}{2}\rho(u^2 + v^2))$,
where $v$ is the fluid velocity in the y-direction.

Fig. 2. 2D Riemann solvers: (a) 9-state HLLE with highlighted initial condition; (b)
16-state HLLEC

First, let us summarize the fully two-dimensional 9-state HLLE Riemann
solver, originally presented in [11]. We want to solve the 2D Riemann problem
formed by two partitions at $x = x_{i+\frac12}$ and $y = y_{j+\frac12}$, subdividing the domain
$[x_i, x_{i+1}] \times [y_j, y_{j+1}]$ into four regions with constant states. As the system
evolves in time, we approximate the solution as shown in Fig. 2(a). Waves
propagating from the center with constant speeds split the domain into 9
regions. For $\Delta t$ sufficiently small, none of the waves in the y-direction will reach
the edge $y = y_j$, so that the state $W_{\frac12,0}$ is affected only by the one-dimensional
Riemann problem in the x-direction, given by $W_{0,0}$ and $W_{1,0}$. Analogous
considerations along the other three edges lead us to the following method: The
corner states ($W_{0,0}$, $W_{1,0}$, $W_{0,1}$, $W_{1,1}$) stay undisturbed. We take these to
compute the edge states ($W_{\frac12,0}$, $W_{\frac12,1}$, $W_{0,\frac12}$, $W_{1,\frac12}$) using a 3-state solver
based on the one-dimensional HLLE solver from Section 2. Then we do not need to
solve the real two-dimensional problem in the central area, since $W_{\frac12,\frac12}$ is given
by a two-dimensional conservation law, applied to the whole staggered cell.

Such a 9-state Riemann solver works well for problems containing shocks and
rarefaction waves, but its weakness is poor resolution of contact discontinuities.
Our goal was to modify it to a solver which resolves contact surfaces better
(stationary contacts even exactly), and which degenerates to the 1D 4-state
solver from Sect. 2 for one-dimensional initial problems in coordinate directions
(i.e. if there is only one interface in the 2D initial condition).

The most straightforward way is to follow Toro's one-dimensional approach
and split each of the intermediate edge regions (i.e. states $W_{\frac12,0}$, $W_{0,\frac12}$,
$W_{1,\frac12}$, $W_{\frac12,1}$) by a wave corresponding to a contact discontinuity. It is
obvious that to fulfill the degeneracy condition mentioned above we must also
subdivide the central region (i.e. state $W_{\frac12,\frac12}$) by two additional waves. This
gives us the setup for a 16-state solver as shown in Fig. 2(b).

It is not difficult to compute the speeds of the splitting waves and the values
of the new intermediate states along the edges. Basically, we use the same
process as in one dimension (5), taking also the transversal velocity into
account. In [9] it is proven that we can simply copy these from the nearest corner
states, that is, along the edge $y = y_j$ we simply take $v_{\frac12,0,L} = v_{0,0}$ and
$v_{\frac12,0,R} = v_{1,0}$, and along the edge $x = x_i$ we take $u_{0,\frac12,L} = u_{0,0}$ and
$u_{0,\frac12,U} = u_{0,1}$, etc.

What we have to do is to set the speeds of the waves splitting the central region.
One possibility is to take the velocities from the conservation laws. We compute
the central state as if it were not further subdivided, then divide the
momentum in each direction by density and use these velocities as speeds of
the splitting waves. Besides this one, there are several other approaches to set
these partitions.

The only remaining issue is to compute new states in the four central regions.
First, we estimate the velocities. To fulfill the condition of degeneracy,
we take for each direction the average of the velocity in edge states and corner
states with interface parallel to this direction, weighted by the length of these
interfaces. For the upper right central region, for example, each velocity
component is the average of the corresponding velocities in the adjacent edge
and corner regions, weighted by the lengths of the interfaces of these regions
with the upper right central region. Then we estimate the central densities.
This time, for each central state, we compute the density from the 1D
Rankine-Hugoniot conditions across each interface with the surrounding regions
(i.e. with corner and edge regions), and weight the results again by the lengths
of the interfaces. Since we know the total mass in the central states, which must
be conserved, we now multiply all four density estimates by a suitable factor to
satisfy the mass conservation law. Similarly, we correct the momentum estimates
(corrected density multiplied by estimated velocity) in both directions by adding
a suitable amount to each state, in order to keep momentum conserved. Finally,
we suppose the pressure to be the same in all four central regions and compute
its value from the conservation law for total energy.

Let us summarize the complete algorithm of our new 2D 16-state HLLEC
Riemann solver (consult Fig. 2(b)):

1. Compute the wave speeds $b^0_{\frac12,0}$, $b^1_{\frac12,0}$, $b^0_{\frac12,1}$, $b^1_{\frac12,1}$, $b^0_{0,\frac12}$, $b^1_{0,\frac12}$,
$b^0_{1,\frac12}$ and $b^1_{1,\frac12}$, using the Einfeldt formulas from [7].

2. Compute the intermediate (light shaded) states for each of the four 1D
problems along the edges.

a) Using the 1D 4-state HLLC solver with appropriate 2D fluxes, compute
the longitudinal velocities (i.e. $u_{\frac12,0}$, $u_{\frac12,1}$, $v_{0,\frac12}$, $v_{1,\frac12}$) and both
densities (i.e. $\rho_{\frac12,0,L}$, $\rho_{\frac12,0,R}$, $\rho_{0,\frac12,L}$, $\rho_{0,\frac12,U}$, $\rho_{\frac12,1,L}$, $\rho_{\frac12,1,R}$,
$\rho_{1,\frac12,L}$, $\rho_{1,\frac12,U}$).

b) Copy the transversal velocities always from the nearest corner state
(i.e. $v_{\frac12,0,L} = v_{0,0}$, $u_{0,\frac12,L} = u_{0,0}$, etc.).

c) Compute the intermediate pressures (i.e. $p_{\frac12,0}$, $p_{\frac12,1}$, $p_{0,\frac12}$, $p_{1,\frac12}$) so that
the total energy in the "1D stripe" along each edge is conserved.

3. Compute the central state $W_{\frac12,\frac12}$ from the 2D integral conservation law
as if it were not further subdivided. These values will be used to assure
conservation and to split the central region.

4. Split the central region into four pieces (UL, UR, LL, LR) divided by waves
with speeds based on the velocities from the previous step.

5. Estimate eight central velocities (i.e. $u_{\frac12,\frac12,\alpha}$, $v_{\frac12,\frac12,\alpha}$, $\alpha \in \{UL, LL, UR, LR\}$)
from adjacent edge and corner regions, so that the solver degenerates to the
1D solver for 1D initial conditions in the x-, resp. y-direction.

6. Estimate four central densities (i.e. $\rho_{\frac12,\frac12,\alpha}$, $\alpha \in \{UL, LL, UR, LR\}$), using
Rankine-Hugoniot conditions across the interfaces with the surrounding (edge
and corner) states, weighted by the lengths of the particular interfaces.

7. Correct the central densities from step 6 so that the integral mass
conservation law over the whole staggered cell is satisfied.

8. Correct the central velocities from step 5 so that the integral momentum
conservation laws in both coordinate directions over the whole staggered cell
are satisfied.

9. Finally, compute the central pressure $p_{\frac12,\frac12}$ (assuming it is equal for all four
central states) so that the total energy is conserved.
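
Steps 7 and 8 can be illustrated with a small sketch. It assumes that the four central regions are described by their areas, that the mass and momentum which the central regions must contain are already known from the 2D conservation law (step 3), and, as a labelled assumption, that the "suitable amount" added to each momentum is the same constant in all four regions; the text does not specify how that amount is distributed. All names are hypothetical.

```python
def correct_densities(rho_est, areas, total_mass):
    """Step 7: scale all density estimates by one factor so that
    sum(area * rho) equals the mass required by conservation."""
    mass_est = sum(r * a for r, a in zip(rho_est, areas))
    factor = total_mass / mass_est
    return [factor * r for r in rho_est]


def correct_momenta(rho, vel_est, areas, total_momentum):
    """Step 8: form momenta rho * velocity and add one constant shift
    (assumed uniform) so that sum(area * momentum) matches the target."""
    mom_est = [r * v for r, v in zip(rho, vel_est)]
    deficit = total_momentum - sum(m * a for m, a in zip(mom_est, areas))
    shift = deficit / sum(areas)
    return [m + shift for m in mom_est]


# Example with four central regions (UL, UR, LL, LR); values are arbitrary.
areas = [0.2, 0.3, 0.1, 0.4]
rho = correct_densities([1.0, 2.0, 1.5, 0.5], areas, total_mass=1.2)
mom_x = correct_momenta(rho, [0.3, 0.1, -0.2, 0.4], areas, total_momentum=0.25)
```

The multiplicative correction keeps the densities positive, whereas an additive correction is the natural choice for momentum, which may change sign.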

4 Application in Difference Schemes 

In this section, we demonstrate how our new Riemann solver can be used in
practice in difference schemes. In particular, we first present the simplest,
first-order Godunov-type method and then a more accurate WAF scheme.

4.1 Godunov- type Scheme 

The scenario of the simplest Godunov scheme is as follows: 

1. Construct the uniform rectangular mesh with spatial steps $\Delta x$ and $\Delta y$,
with cells centered at $(x_i, y_j)$. The dual (staggered) mesh will be formed by
cells centered at $(x_{i+\frac12}, y_{j+\frac12})$.

2. Discretize the initial condition, so that the values are constant inside each
original cell. The interfaces form 2D Riemann problems in staggered cells.

3. Choose the time interval $\Delta t$ so that these Riemann problems stay separated
in their respective staggered cells during the whole time step.

4. Let the Riemann problems evolve and compute their approximate solutions 
after At with the HLLEC Riemann solver from Sect. 3. 

5. Create new constant states by integrating the solutions over the original 
cells. (For us, this means only weighted averaging.) 

6. Repeat from step 3 until the solution at desired time is reached. 
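
Step 3 of the scenario amounts to a CFL-type restriction. The sketch below assumes that the waves of each Riemann problem start at the center of a staggered cell of size $\Delta x \times \Delta y$, so that a wave may travel at most half a cell in each direction during one step; the function name, the safety factor and the way the maximal wave speeds are supplied are illustrative assumptions.

```python
def godunov_time_step(dx, dy, max_speed_x, max_speed_y, safety=0.9):
    """Largest time step for which waves starting at the center of a
    staggered cell stay inside that cell, reduced by a safety factor.

    max_speed_x, max_speed_y -- largest absolute wave speeds in the x- and
    y-direction over all Riemann problems of the current time level.
    """
    if max_speed_x <= 0.0 or max_speed_y <= 0.0:
        raise ValueError("maximal wave speeds must be positive")
    return safety * min(0.5 * dx / max_speed_x, 0.5 * dy / max_speed_y)
```

The wave speeds are those produced by the Riemann solver itself, so in practice the time step is recomputed at every pass through step 3.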
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Fig. 3. ID WAF scheme with a 4-state approximate Riemann solver 



4.2 WAF Scheme 

Here we apply our 2D Riemann solver in Weighted Average Flux (WAF)
methods [6, 7]. We want to solve the system of Euler equations in one
dimension. We start as with the Godunov approach: we create a uniform mesh and
replace the initial condition by a piecewise constant function, so that it forms
a set of Riemann problems at cell interfaces. Then we let the system evolve in
time and apply the approximate 4-state HLLC solver to each Riemann problem
as in Fig. 3. To obtain the value in the cell center $x_i$ at the new time level
($t^{n+1} = t^n + \Delta t$), we use the conservative formula

$$W_i^{n+1} = W_i^n - \frac{\Delta t}{\Delta x}\Big(F_{i+\frac12} - F_{i-\frac12}\Big), \qquad (7)$$

where $F_{i-\frac12}$ and $F_{i+\frac12}$ are weighted average fluxes in the staggered cell at left,
resp. at right. These can be computed at each time level from the real physical
fluxes in the four states of the solver as

$$F_{i+\frac12} = \sum_{k=1}^{4} w_k F^{(k)}_{i+\frac12}. \qquad (8)$$



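To avoid excessive oscillations, weights are taken as

$$w_k = \frac{1}{2}\left(\phi_k - \phi_{k-1}\right), \qquad (9)$$

with $\phi_0 = -1$, $\phi_4 = 1$, and the limiter functions

$$\phi_k = \operatorname{sgn}(\nu_k)\,\phi(r_k, |\nu_k|), \quad k = 1,\dots,3. \qquad (10)$$

One can use the van Leer limiter, minmod, or many others listed, for example, in [7].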
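
The weighting can be illustrated numerically. The sketch below assumes weights of the form $w_k = (\phi_k - \phi_{k-1})/2$ with $\phi_0 = -1$ and $\phi_4 = 1$, and takes the three intermediate values as given numbers (without limiting they would be the Courant numbers of the three waves). The function names and the sample values are hypothetical.

```python
def waf_weights(phi_inner):
    """Weights w_k = (phi_k - phi_{k-1}) / 2 with phi_0 = -1 and phi_4 = 1."""
    phi = [-1.0] + list(phi_inner) + [1.0]
    return [(phi[k] - phi[k - 1]) / 2.0 for k in range(1, len(phi))]

def waf_flux(fluxes, phi_inner):
    """Weighted average of the physical fluxes in the four solver states."""
    weights = waf_weights(phi_inner)
    return sum(w * f for w, f in zip(weights, fluxes))

# Made-up intermediate values for three waves and made-up scalar fluxes.
weights = waf_weights([-0.4, 0.1, 0.45])
flux = waf_flux([1.0, 2.0, 3.0, 4.0], [-0.4, 0.1, 0.45])
```

Because the sum of the weights telescopes to $(\phi_4 - \phi_0)/2 = 1$, a flux that is the same in all four states is reproduced exactly.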

In two dimensions, the conservative formula is

$$U_{i,j}^{n+1} = U_{i,j}^{n} - \frac{\Delta t}{\Delta x}\left(F_{i+1/2,j} - F_{i-1/2,j}\right) - \frac{\Delta t}{\Delta y}\left(G_{i,j+1/2} - G_{i,j-1/2}\right),$$

and we compute each weighted average flux from a cell-sized region centered at the point given by its indices (i.e. flux $F_{i+1/2,j}$ from the rectangle $[x_i, x_{i+1}] \times [y_{j-1/2}, y_{j+1/2}]$, etc.). Rather than introducing 2D limiters and deriving the WAF method in two dimensions, we use the 1D WAF approach in each appropriate direction, with each of the four fluxes taken as an average from a suitable set of states. This is shown schematically in Fig. 4 and described more precisely in [9].

Fig. 4. 2D WAF scheme: (a) computation of $F_{i+1/2,j}$, (b) computation of $G_{i,j+1/2}$



5 Numerical Results 

In Fig. 5, we can see results for the density, as computed by the Godunov scheme with the 9-state solver, and by both Godunov and WAF schemes with the new HLLEC solver. The test is a slightly modified version of Test 16 from [4], originally presented and described in [5]. The domain $[0, 1] \times [0, 1]$ is split by two partitions, located at $x = 0.5$ and $y = 0.5$, into four regions with constant states, forming a 2D Riemann problem. Initial values



    upper left:   $\rho_{UL} = 1.0222$   $p_{UL} = 1.0$   $u_{UL} = -0.7179$   $v_{UL} = 0.1$
    upper right:  $\rho_{UR} = 0.5313$   $p_{UR} = 0.4$   $u_{UR} = 0$         $v_{UR} = 0.1$
    lower left:   $\rho_{LL} = 0.8$      $p_{LL} = 1.0$   $u_{LL} = 0$         $v_{LL} = 0.1$
    lower right:  $\rho_{LR} = 1.0$      $p_{LR} = 1.0$   $u_{LR} = 0$         $v_{LR} = 0.8276$

are chosen so that the analytical solution of the one-dimensional problem along each edge of the domain is a simple wave. From the top in clockwise direction these are: left-propagating rarefaction, upward moving shock, stationary contact, upward moving contact.

As expected, our new two-dimensional HLLEC Riemann solver gives better results than the 9-state HLLE. The most visible improvement is a sharper resolution of contact discontinuities; the stationary contact is even resolved exactly. However, the shock and the rarefaction wave are also treated better by the new solver. The price we have to pay for this improved resolution is slight but visible artifacts at the initial locations of discontinuities, in the middle of the upper edge in Fig. 5 (b),(c). Comparing both schemes which use the HLLEC solver, we see that WAF performs better than Godunov, which is not surprising, since it is a second order accurate scheme.

Fig. 5. Contour plots of density at time $t = 0.2$ as computed by particular difference schemes: (a) Godunov with HLLE (9-st.), CFL=0.49, 339 time steps, 400x400 mesh vertices; (b) Godunov with HLLEC (16-st.), CFL=0.49, 347 time steps, 400x400 mesh vertices; (c) WAF with HLLEC (16-st.), CFL=0.49, 352 time steps, 400x400 mesh vertices
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Summary. We describe the algorithm to implement a deflation acceleration in a pre- 
conditioned Conjugate Gradient method to solve the system of linear equations from 
a Finite Element discretization. We focus on a parallel implementation in this pa- 
per. Subsequently we describe the data-structure. This is followed by some numerical 
experiments. The experiments indicate that our method is scalable. 



1 Introduction 

Large linear systems occur in many scientific and engineering applications. Of- 
ten these systems result from a discretization of model equations. The systems 
tend to become very large for three-dimensional problems. Some models in- 
volve time and space as independent parameters and therefore it is necessary 
to solve such a linear system efficiently at each time-step. 

In this paper we only consider symmetric positive definite (SPD) discretization matrices. Since the matrices are sparse in our applications, we use an iterative method to solve the linear system. In order to get a fast convergence of the method we use a preconditioned Conjugate Gradient method, where incomplete Choleski factorization is used as a preconditioner. This method is very suitable for parallelization.

The present study involves a parallelization of the Conjugate Gradient method in which the inner products, the matrix-vector multiplication and preconditioning are parallelized. This parallelization is done by the use of domain decomposition, where the domain of computation is divided into subdomains and the overall discretization matrix is divided over the subdomains. To each subdomain we allocate a processor. A well-known problem is that the parallelized method is not scalable: the number of CG-iterations and wall-clock time increase as the number of subdomains increases. To make the method scalable one uses a coarse grid correction (see Smith et al. [6] for an overview and introduction) or the deflation method. In [3] it is shown that deflation gives a larger acceleration to the parallel preconditioned CG-method than the coarse grid correction. The idea to use deflation for large linear systems of equations is not new. Among others, Nicolaides [4] and Vuik et al. [8, 1, 10, 9] apply this method to solve large ill-conditioned linear systems. The result of deflation is that the components of the solution in the direction of the eigenvectors corresponding to the extremely small eigenvalues are projected to zero. The effective condition number of the resulting singular system becomes more favourable. In the present paper we deal with "algebraic" deflation vectors. For more details on the various types of deflation vectors we refer to [8] and [7].

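We assume that the domain of computation $\Omega$ consists of a number of disjoint subdomains $\Omega_j$, $j \in \{1, \dots, m\}$, such that $\bigcup_{j=1}^{m} \overline{\Omega}_j = \overline{\Omega}$. To each subdomain we allocate a processor and a deflation vector $z_j$, for which we define

$$z_j(x, y) = \begin{cases} 1, & \text{for } (x, y) \in \Omega_j, \\ 0, & \text{for } (x, y) \in \Omega \setminus \Omega_j. \end{cases} \qquad (1)$$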



In case of Finite Volume methods we have to distinguish between cell-centered and vertex-centered discretization. In the cell-centered case the deflation vector is not defined on the interfaces between consecutive subdomains. In the vertex-centered case, however, we have an overlap at the interface points. In this paper we use a Finite Element discretization, which is always vertex-centered by its construction. The subdomains may be considered as "super"-elements consisting of a set of finite elements. The global stiffness matrix is never constructed, only the "super"-element matrix is constructed. Matrix-vector multiplication is carried out per "super"-element, and the global vector is obtained only after adding the contributions of each "super"-element. In this way parallelization of the Finite Element method can be done in a natural way. For the interface points we use the concept of "average overlap", which is explained as follows: Given a deflation vector $z_j$ on an interfacial node that is shared by $\Omega_j$ and $p$ neighbours of $\Omega_j$, we set at this point:

$$z_j = \frac{1}{p+1}. \qquad (2)$$
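
The "average overlap" rule can be sketched in a few lines. The example assumes that every subdomain is given as a list of global node numbers and that shared interface nodes appear in each subdomain that touches them; the function name `deflation_vectors` and the one-dimensional example are hypothetical.

```python
def deflation_vectors(n_nodes, subdomains):
    """Return one deflation vector (list of length n_nodes) per subdomain."""
    # count[i] = number of subdomains containing node i, i.e. p + 1 in equation (2)
    count = [0] * n_nodes
    for nodes in subdomains:
        for i in nodes:
            count[i] += 1
    vectors = []
    for nodes in subdomains:
        z = [0.0] * n_nodes          # zero outside the subdomain, equation (1)
        for i in nodes:
            z[i] = 1.0 / count[i]    # 1 in the interior, 1/(p+1) on interfaces
        vectors.append(z)
    return vectors

# Seven nodes on a line, two subdomains sharing node 3.
z1, z2 = deflation_vectors(7, [[0, 1, 2, 3], [3, 4, 5, 6]])
```

With this choice the deflation vectors add up to one at every node, including the interface nodes.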



The deflation method is applied successfully to problems from transport in porous media where coefficients abruptly change by several orders of magnitude [9]. In the present paper we consider a Galerkin Finite Element discretization of the Laplace equation with a Dirichlet and a Neumann boundary condition at $\Gamma_D$ and $\Gamma_N$ respectively (note that $\Gamma_N \cup \Gamma_D = \partial\Omega$):

$$\begin{cases} -\Delta u = f, & (x, y) \in \Omega, \\ u = \bar{u}(x, y) \ \text{for } (x, y) \in \Gamma_D, \quad \dfrac{\partial u}{\partial n} = 0 \ \text{for } (x, y) \in \Gamma_N, \end{cases} \qquad (3)$$

where $u$ denotes the solution and $\bar{u}$ represents a given function. The resulting discretization matrix is symmetric positive definite. The domain is divided into subdomains and the resulting system of linear equations is solved by the use of a parallelized Deflated ICCG. In the text the algorithm is given and the issues of data-structure for the parallelization of the solution method are described. Subsequently, we describe some numerical experiments. For more mathematical background we refer to [1, 10, 7].

Fig. 1. Domain decomposition for a vertex-centered discretization (subdomain 1, subdomain 2)

2 Deflated Incomplete Choleski preconditioned 
Conjugate Gradient Method 

In this section we describe the deflated method for the symmetric positive definite discretization matrix $A$. Let $Z = (z_1 \dots z_m)$ represent the matrix whose columns consist of the deflation vectors $z_j$, as defined in equations (1) and (2). The matrix $Z$ is chosen such that its column space approximates the eigenspace of those eigenvectors that correspond to the smallest eigenvalues. We then define the projection $P := I - AZ(Z^T A Z)^{-1} Z^T = I - A Z E^{-1} Z^T$, with $E := Z^T A Z$. It is shown in [7] that the matrix $PA$ is positive semi-definite (and hence singular). Kaasschieter [2] showed convergence of the Conjugate Gradient method for cases in which the matrix is singular. Let $b$ be the right-hand side vector and $x$ be the solution vector, then we solve

Ax = b. (4) 

After application of deflation by left multiplication of the above equation by $P$, we obtain

$$PAx = Pb. \qquad (5)$$

Since $PA$ is singular the solution is not unique. We denote the solution that is obtained by use of the ICCG method on equation (5) by $\tilde{x}$. To get the solution $x$ we use

$$x = (I - P^T)x + P^T x. \qquad (6)$$

It is shown in [7] that $P^T \tilde{x} = P^T x$, hence the solution of equation (5) can be used. The part $(I - P^T)x = Z(Z^T A Z)^{-1} Z^T b$ is relatively cheap to compute. Hence the solution $x$ is obtained by addition of the two contributions, i.e. $x = P^T \tilde{x} + Z(Z^T A Z)^{-1} Z^T b$. For completeness we give the algorithm of the Deflated ICCG:



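Algorithm 1 (DICCG [9]):
$k = 0$, $\hat{r}_0 = P r_0$, $p_1 = z_0 = L^{-T} L^{-1} \hat{r}_0$
while $\|\hat{r}_k\|_2 > \varepsilon$
    $k = k + 1$
    $\alpha_k = \dfrac{\hat{r}_{k-1}^T z_{k-1}}{p_k^T P A p_k}$
    $x_k = x_{k-1} + \alpha_k p_k$
    $\hat{r}_k = \hat{r}_{k-1} - \alpha_k P A p_k$
    $z_k = L^{-T} L^{-1} \hat{r}_k$
    $\beta_k = \dfrac{\hat{r}_k^T z_k}{\hat{r}_{k-1}^T z_{k-1}}$
    $p_{k+1} = z_k + \beta_k p_k$
end while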
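
The following sequential sketch illustrates the deflated iteration together with the final correction $x = P^T \tilde{x} + Z E^{-1} Z^T b$. It is not the implementation used in this paper: a Jacobi (diagonal) preconditioner replaces the incomplete Choleski factorization, dense Python lists replace the sparse subdomain data-structure, and all function names are hypothetical.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve_small(E, rhs):
    """Gaussian elimination with partial pivoting for the small system E y = rhs."""
    m = len(rhs)
    M = [row[:] + [r] for row, r in zip(E, rhs)]
    for c in range(m):
        piv = max(range(c, m), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, m):
            f = M[i][c] / M[c][c]
            for j in range(c, m + 1):
                M[i][j] -= f * M[c][j]
    y = [0.0] * m
    for i in reversed(range(m)):
        y[i] = (M[i][m] - sum(M[i][j] * y[j] for j in range(i + 1, m))) / M[i][i]
    return y

def deflated_pcg(A, b, Z, tol=1e-10, max_iter=200):
    n, m = len(b), len(Z)                       # Z: list of m deflation vectors
    AZ = [matvec(A, z) for z in Z]              # vectors A z_j
    E = [[dot(Z[i], AZ[j]) for j in range(m)] for i in range(m)]  # E = Z^T A Z

    def coarse(v):                              # y = E^{-1} Z^T v
        return solve_small(E, [dot(z, v) for z in Z])

    def apply_P(v):                             # P v = v - A Z E^{-1} Z^T v
        y = coarse(v)
        return [v[i] - sum(y[j] * AZ[j][i] for j in range(m)) for i in range(n)]

    def apply_PT(v):                            # P^T v = v - Z E^{-1} Z^T A v
        y = coarse(matvec(A, v))
        return [v[i] - sum(y[j] * Z[j][i] for j in range(m)) for i in range(n)]

    def precond(r):                             # Jacobi preconditioner (assumption)
        return [r[i] / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = apply_P(b)                              # deflated residual for x_0 = 0
    z = precond(r)
    p = z[:]
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        w = apply_P(matvec(A, p))
        alpha = dot(r, z) / dot(p, w)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r_new = [ri - alpha * wi for ri, wi in zip(r, w)]
        z_new = precond(r_new)
        beta = dot(r_new, z_new) / dot(r, z)
        p = [zi + beta * pi for zi, pi in zip(z_new, p)]
        r, z = r_new, z_new
    y = coarse(b)                               # coefficients of Z E^{-1} Z^T b
    x_pt = apply_PT(x)
    return [x_pt[i] + sum(y[j] * Z[j][i] for j in range(m)) for i in range(n)]

# 1D Laplacian with 8 unknowns and two subdomain indicator vectors.
n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
Z = [[1.0] * 4 + [0.0] * 4, [0.0] * 4 + [1.0] * 4]
b = [1.0] * n
x = deflated_pcg(A, b, Z)
```

The only differences from a standard preconditioned CG loop are the two applications of $P$ and the final correction step, which is what makes the method attractive to retrofit into an existing solver.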



The inner products, matrix-vector multiplication and vector updates in the above algorithm are easy to parallelize. Parallelization of the incomplete Choleski preconditioned Conjugate Gradient method has been done before by Perchat et al. [5]. We use a restriction and a prolongation operator and block preconditioners for the preconditioning step in the above algorithm. Note that it is necessary to have a symmetric preconditioner. This is obtained by choosing the restriction and prolongation matrices as transposes of each other. Let $r_k$ be the residual after $k$ CG-iterations and $N$ be the total number of subdomains, then overall preconditioning is expressed in matrix form by:

$$z_k = \left(\sum_{i=1}^{N} R_i^T M_i^{-1} R_i\right) r_k. \qquad (7)$$

Here $z_k$ represents the updated residual after preconditioning. Further, $R_i$ and $M_i$ respectively denote the restriction operator and block preconditioner. We will limit ourselves to the issues of the implementation of the data-structure needed for the parallel implementation of deflation. The above algorithm is a standard ICCG except for the lines that contain the matrix $P$.



3 Data-structure of the deflation vectors for
parallelization

To create $P$ and $Pv$ we need to make $z_j$ and to compute $Az_j$, $z_i^T A z_j$ and $Pv$. To do this efficiently we make use of the sparsity pattern of the deflation vectors $z_j$. We create the vectors $z_j$ in the subdomain $\Omega_j$ only and send essential parts to its direct neighbours. We explain the data-structure and communication issues for a rectangular example. In the explanation we use the global numbering from the left part of Figure 2. Note that in the implementation the local numbering is used in the communication and calculation part. The global numbering is used for post-processing purposes only. The example can be generalized easily to other configurations.

Fig. 2. A sketch of division of $\Omega$ into subdomains $\Omega_1, \dots, \Omega_4$. Left figure represents the global numbering of the unknowns, right figure represents the local numbering.



In Figure 2 on each subdomain $\Omega_i$ a deflation vector $z_i$ is created. For all the interface nodes (say numbers 3, 8, 13, 12, 11 of $\Omega_1$) the number of neighbours is determined, then equation (2) is applied to determine the value of the corresponding entry of $z_1$. This implies that it is necessary to have a list of interface nodes for each subdomain and to have a list with the number of neighbouring subdomains on which a particular interface node is located. However, this information is not sufficient. For example the vector $Az_1$ must be computed and also be multiplied with vectors $z_j$. The vector $Az_1$ has non-zero entries not only inside the domain $\Omega_1$ and on its interfaces, but also in its direct neighbours, i.e. the points 4, 9, 14, 16, 17, 18, 19. Therefore we also need to extend the vectors $z_1$ with these points to have a well defined matrix-vector multiplication. In the same way the vectors $Az_j$, $j \in \{2,3,4\}$, have non-zero entries in $\Omega_1$. For the other deflation vectors we proceed analogously.

We further explain the computational part which is relevant to processor 1, i.e. subdomain $\Omega_1$ only; the other subdomains are dealt with similarly. For example $Az_2$ will have a non-zero contribution in all interface points of $\Omega_1$ and $\Omega_2$ but also in the points 2, 7 and 12. All vectors in common points of any subdomain and $\Omega_1$ are given the global value, i.e. the value that is the result of addition. This requires communication between $\Omega_1$ and this particular subdomain. This is in contrast to the matrix $A$, which is stored only locally per subdomain without the addition at common interfaces. So to compute $Az_2$ on $\Omega_1$ we need an extra list of neighbouring points of $\Omega_2$ in domain $\Omega_1$ that are not on the common interface. This, however, is not sufficient. For example the value of $Az_2$ in nodal point 12 also has a contribution of $\Omega_4$. So in $\Omega_1$ we need an extra list of points on common sides of $\Omega_1$ and $\Omega_j$ that are direct neighbours of $\Omega_k$ ($j \ne k$) but are not on the interface of $\Omega_1$ and $\Omega_k$. For example node 12 is a common point of $\Omega_1$ and $\Omega_4$ and a neighbour of $\Omega_2$; further $\Omega_4$ is a neighbour of $\Omega_2$. After communication and addition of the values of $Az$ at these particular nodes, the matrix $E$, consisting of the inner products $z_i^T A z_j$, is calculated and sent to processor 1.

Then, for a given vector $v$ we compute at each processor its inner product with $z_i$ ($z_i^T v$). Then all these inner products are sent to processor 1 ($\Omega_1$), where $y = E^{-1} (z_1^T v \ \ z_2^T v \ \ z_3^T v \ \ z_4^T v)^T$ is computed by a Choleski decomposition, and subsequently the results are sent to all the other neighbouring processors. Then, $Pv = v - AZy$ is computed locally. All the steps are displayed schematically in Algorithm 2, where we explain the situation for a case with two processors. Note that $E$ has a profile structure where the profile is defined by the numbering pattern of the subdomains. Hence for a block structure in two dimensions we obtain a similar sparsity pattern for $E$ as for a two-dimensional discretization. If, however, a layered structure is used, then $E$ gets the same pattern as a one-dimensional discretization matrix.

Algorithm 2 (Parallelization of Deflation) $P = I - AZ(Z^T A Z)^{-1} Z^T$.
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4 Numerical experiments 

To illustrate the advantage of the deflation method we present the number 
of CG-iterations and wall-clock time as a function of the number of layers 
(left and right graphs respectively in Figure 3). We start with one layer and 
extend the domain of computation with one horizontal layer, which is placed 
on top. This is done consecutively up to 7 layers. In the examples we choose 
the number of elements the same in each layer. The problem size increases 
as the number of layers increases. It can be seen that if deflation is not used 
then the convergence will take more time since the number of CG iterations 
increases. The use of deflation yields that the number of iterations and wall- 
clock time for the parallel case does not depend on the number of layers. This is 
also observed for the number of CG-iterations for the sequential computations. 
This makes the method scalable. 

Further, we present the number of iterations as a function of the number
of layers as in the preceding example for three methods: no projection, coarse
grid correction and deflation. The results are shown in Figure 4 (left graph). It 
can be seen that both the coarse grid correction and the deflation methods are 
scalable, however deflation gives the best results. This is in agreement with the 
analysis as presented in [3] where it is proven that the deflated method con- 
verges faster than the coarse grid correction. The same behaviour is observed 
if the domain is extended in a blockwise distribution of added subdomains (see 
the right graph in Figure 4). 




Fig. 3. Left figure: The number of iterations as a function of the number of layers for the deflated and non-deflated parallelized and sequential ICCG method. Right figure: The wall-clock time as a function of the number of layers for the deflated and non-deflated parallelized and sequential ICCG method. (Curves: deflation seq., no deflation par., deflation par.; horizontal axis: number of layers.)

Fig. 4. The number of iterations as a function of the number of layers for the parallelized ICCG method for three methods: no projection, coarse grid correction and deflation. Left graph: layered extension of the domain of computation. Right graph: blockwise extension of the domain of computation.



5 Conclusions 

The deflation technique has been implemented successfully in a parallelized and sequential ICCG method to solve an elliptic problem by the use of finite elements. The domain decomposition can be chosen blockwise and layerwise. Some numerical experiments are shown in the present paper. Further, the number of iterations and wall-clock time become independent of the number of added layers if deflation is applied in a parallel ICCG method. Hence deflation is favourable in both sequential and parallel computing environments.

Summary. Checkpointing techniques become more and more necessary for the computation of adjoints. This paper presents the more common multi-level checkpointing as well as the less known binomial checkpointing. The checkpointing approaches are compared with respect to the number of time steps the adjoint of which can be calculated, the run-time needed for the adjoint calculation and the memory requirement. Some examples illustrate the shown results.



1 Introduction 

For many time-dependent applications, the corresponding simulations can be 
performed using ordinary or partial differential equations. Furthermore, quite 
often there are quantities that influence the result of the simulation. Through- 
out, we assume that these quantities are control functions, for example heating 
in and/or at the boundary of a domain. To compute an approximation of the 
simulated process for a time interval [0,T], one applies an appropriate inte- 
gration scheme given by 

$$y_0 = y^0, \qquad y_i = F_i(y_{i-1}, u_i, t_i), \quad i = 1, \dots, N,$$

where $y_i \in \mathbb{R}^n$ denotes the state and $u_i \in \mathbb{R}^m$ the control at time $t_i$ for a given time grid $t_0, \dots, t_N$ with $t_0 = 0$ and $t_N = T$. The operator $F_i : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^n$ defines the time step to compute the state at time $t_i$. Note that we do not assume a uniform grid. To optimize a specific criterion or to obtain a desired state, the cost functional

$$J(u) = J(y(u), u)$$

measures the quality of $y(u)$ and $u = (u_1, \dots, u_N)$. Here, $y(u) = (y_1(u), \dots, y_N(u))$ describes the dependence of the state $y$ on the control $u$. For applying a calculus-based optimization method, one may use an adjoint integration

$$\bar{y}_N = 0, \qquad \bar{y}_{i-1} = \bar{F}_i(\bar{y}_i, y_{i-1}, u_i), \quad i = N, \dots, 1, \qquad (1)$$

motivated by the adjoint differential equation that belongs to the differential equation describing the state. Subsequently or concurrently, the desired derivative information $J_u(u)$ can be reconstructed from $\bar{y} = (\bar{y}_0, \dots, \bar{y}_N)$. The specific choice of the adjoint steps $\bar{F}_i$ depends on the forward integration and whether one prefers the continuous adjoint or the discrete adjoint formulation, see e.g. [7, 3, 5]. For the purpose of this paper, it is only important to note that the adjoint integration has to be performed backwards in time and that the complete forward trajectory $y = (y_0, \dots, y_{N-1})$ is required. Hence, storing all states $(y_0, \dots, y_{N-1})$ during the forward integration and reading them in reverse order during the adjoint integration forms one simple possibility to overcome this difficulty. Then the computing time for the adjoint calculation consists of the evaluation of $N$ time steps $F_i$ storing the state $y_{i-1}$ and the evaluation of $N$ adjoint steps $\bar{F}_i$.

The storage requirement of the basic approach to calculate adjoints is pro- 
portional to the number N of time steps. If we want to calculate the adjoint of 
a real-world problem with thousands of time steps this memory requirement of 
the basic approach may become a serious problem. For example, for computing 
3D flows with unstructured grids one may easily need 10 to 100 MBytes to
store only one state vector $y_i$ [10]. Therefore, it is reasonable to assume that
due to their size, only a very limited number of intermediate states can be 
kept in memory. They may serve as checkpoints, such that the required infor- 
mation for the backward integration is generated piecewise during the adjoint 
calculation. Sections 2 and 3 present two different checkpointing techniques. 
The resulting run-times and memory requirements are compared in Section 4. 
Finally, some conclusions and an outlook are given in Section 5. 



2 Uniform Checkpoint Distribution 

To distribute the checkpoints equidistantly over the given number of time steps forms one obvious solution to the storage requirement problem. Subsequently the adjoints are computed for each of the resulting groups of time steps separately. Denoting the number of checkpoints used by $c$, the corresponding calculation of the adjoint values can be performed using the following algorithm, where the counter $i$ is identified with the state $y_i$.

Two-level Checkpointing

Initialization: Reserve space for $c_1$ checkpoints, store the initial state $y_0$ in the first one and set

$$c_2 = \begin{cases} \lceil N/(c_1 + 1) \rceil & \text{if } c_1 \lceil N/(c_1 + 1) \rceil < N, \\ \lfloor N/(c_1 + 1) \rfloor & \text{else.} \end{cases}$$

Advance: Starting from the initial state, advance to state $c_1 \cdot c_2$ by performing the time steps $F_i$, $1 \le i \le c_1 \cdot c_2$. While integrating forward, store the states $(j - 1)\, c_2$ in the checkpoints $j$ for $j = 2, \dots, c_1$.

Reverse:
do p = c_1, 0, -1
    Evaluate the time steps $F_i$, $p \cdot c_2 < i < N$, storing the states $i$, $p \cdot c_2 < i \le N - 1$,
    perform the adjoint steps $\bar{F}_i$, $N \ge i > p \cdot c_2$, to calculate the adjoints,
    set $N = p \cdot c_2$, if $p > 0$ read the contents of checkpoint $p$.
end do
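
The cost of this procedure can be checked by simply counting. The sketch below assumes that the quantity set in the initialization is the group length $\lceil N/(c_1+1) \rceil$ (or its floor), that $N > c_1 + 1$, and that each reverse sweep re-evaluates all but the last time step of its group; the function name `two_level_cost` is hypothetical.

```python
import math

def two_level_cost(n_steps, c1):
    """Count forward time steps and peak number of stored states."""
    group = math.ceil(n_steps / (c1 + 1))
    if c1 * group >= n_steps:
        group = n_steps // (c1 + 1)
    forward = c1 * group               # Advance: time steps F_1 .. F_{c1*group}
    peak = 0
    end = n_steps
    for p in range(c1, -1, -1):        # Reverse: do p = c1, 0, -1
        start = p * group
        forward += end - start - 1     # time steps F_i with start < i < end
        peak = max(peak, c1 + (end - start - 1))  # checkpoints + states of the group
        end = start                    # adjoint steps done down to state `start`
    return forward, peak

forward_steps, stored_states = two_level_cost(16, 3)
```

For $N = 16$ and three first-level checkpoints this count gives 24 time step evaluations and at most 6 stored states, the values quoted for the example of Fig. 1.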

Fig. 1 sketches the two-level checkpointing for $N = 16$ time steps and $c = c_1 + c_2 = 6$. Throughout, the time steps are plotted along the vertical axis and the computing time required for the adjoint calculation is represented by the horizontal axis. Each solid horizontal line including the horizontal axis itself represents a checkpoint. The time when a state is stored in a checkpoint is marked with a black circle for the first level and with a black square for the second level. The slanted black lines represent the evaluation of time steps. The adjoint steps are drawn as dashed slanted lines. Finally, black arrows depict the usage of a state $y_i$ for an adjoint step $\bar{F}_{i+1}$ without performing the corresponding time step $F_i$. This adjoint calculation is possible due to the assumed structure (1) of the adjoint steps. Note that it may be required to evaluate $F_N$ once to initialize the adjoints. This evaluation can be introduced right after the evaluation of $F_{N-1}$ for $p = c_1$. For illustration purposes, we suppose throughout that all time steps and all adjoint steps have the same temporal complexity normalized to 1. However, to apply the presented optimal checkpointing techniques, only the identical temporal complexity of all time steps is required. In this example, 24 time steps are performed. Hence, the number of additional time step evaluations caused by the two-level checkpointing compared to the basic approach equals 9. Furthermore, at most 6 states have to be kept in memory.

Fig. 1. Two-level checkpointing for $N = 16$ time steps and $c = c_1 + c_2 = 6$ checkpoints

The two-level checkpointing has been proposed several times in the literature, e.g. [11, 1], and is easy to implement. Naturally, one can apply two-level checkpointing repeatedly for the groups of time steps that are separated by equidistant checkpoints. This approach is called multi-level checkpointing [6] and sketched by Fig. 2 for the three-level case. The multi-level checkpointing is defined by the number of levels $r$, the number of checkpoints $c_i$ that are uniformly distributed at level $i$, $i = 1, \dots, r - 1$, and the number of states $c_r$ that have to be stored at the highest level $r$. Hence, the parameters of the adjoint calculation shown in Fig. 2 are $c_1 = 2$, $c_2 = 2$, and $c_3 = 1$. For a given $r$-level checkpointing, one easily derives the following identities

$$N_r = \prod_{i=1}^{r} (c_i + 1), \qquad M_r = \sum_{i=1}^{r} c_i, \qquad T_r = \sum_{i=1}^{r} \frac{c_i N_r}{c_i + 1} = r N_r - \sum_{i=1}^{r} \prod_{j=1,\, j \ne i}^{r} (c_j + 1),$$

where $N_r$ denotes the number of time steps for which the adjoint can be calculated using the specific $r$-level checkpointing. The corresponding memory requirement equals $M_r$. The number of time step evaluations required for the adjoint calculation is given by $T_r$, since at the first level $c_1 N_r/(c_1 + 1)$ time steps have to be evaluated to reach the second level. At the second level, one group of time steps is divided into $c_2 + 1$ groups. Hence, $c_2 (N_r/(c_1 + 1))/(c_2 + 1)$ time steps have to be evaluated in each group to reach the third level. Therefore, we obtain $(c_1 + 1) c_2 (N_r/(c_1 + 1))/(c_2 + 1) = c_2 N_r/(c_2 + 1)$ at the second level and so on. It follows that each time step $F_i$ is evaluated at most $r$ times. Hence, if we apply two-level checkpointing, each time step is evaluated no more than two times.

The two-level as well as the multi-level checkpointing technique have the drawback that at each level the checkpoints are not reused. Each checkpoint stores at each level only one state and becomes idle as soon as the data that is stored in the checkpoint has been used for the adjoint calculation. A method that reuses the checkpoints as soon as possible is proposed in the next section.
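
The three quantities $N_r$, $M_r$ and $T_r$ of multi-level checkpointing are easy to evaluate for given checkpoint numbers. The following sketch (with the hypothetical function name `multilevel`) computes them directly from the product and sum formulas.

```python
from math import prod

def multilevel(c):
    """Return (N_r, M_r, T_r) for checkpoint numbers c = [c_1, ..., c_r]."""
    n_r = prod(ci + 1 for ci in c)                 # number of time steps
    m_r = sum(c)                                   # memory requirement
    # n_r is a multiple of every (ci + 1), so the integer division is exact
    t_r = sum(ci * n_r // (ci + 1) for ci in c)    # time step evaluations
    return n_r, m_r, t_r

two_level = multilevel([3, 3])        # two-level example with N = 16
three_level = multilevel([2, 2, 1])   # parameters quoted for the three-level case
```

For the two-level example this reproduces $N_r = 16$, $M_r = 6$ and $T_r = 24$.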



3 Binomial Checkpoint Distribution 

When one applies the checkpointing technique proposed in [5], the adjoint values are again generated piece by piece but only one state is employed for the adjoint calculation at any time. Therefore, the checkpointing procedure has to be adapted as follows:

Binomial Checkpointing

Initialization: Reserve space for $c$ checkpoints and store the initial state $y_0$ in the first one.
do p = N, 1, -1
    Advance: Starting from the last checkpoint assigned, advance to state $p - 1$. If checkpoints are free, set as many of them as possible to states $i$ along the way.
    Reverse: Perform the adjoint step $\bar{F}_p$ to calculate the adjoint. If state $p - 1$ is stored in a checkpoint, free the checkpoint up.
end do

The memory requirement of this checkpointing procedure equals $M_b = c$. Naturally, the question arises where one should place the checkpoints in the action "Advance" of the algorithm to minimize the number of time step evaluations. The application of the routine revolve ensures that the initiated checkpointing process is provably optimal with respect to the run-time increase for a given number of checkpoints [5]. More specifically, for the structure (1) of the adjoint steps considered here, the following complexity result holds:

Theorem 1. Let $N$ be the total number of time steps for which the adjoint has to be calculated. Suppose up to $c$ checkpoints are available at any time. Then the minimal number of time step evaluations needed for the adjoint calculation equals

$$T_b = rN - \binom{c + r}{r - 1},$$

where $r$ is the unique integer satisfying

$$\binom{c + r - 1}{r - 1} < N \le \binom{c + r}{r}. \qquad (2)$$
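
The statement of Theorem 1 can be evaluated and cross-checked numerically. The sketch below computes $r$ and $T_b$ from the closed form and compares the result with a brute-force recursion over the position of the next checkpoint. The recursion and the function names are illustrative assumptions; this is not the routine revolve.

```python
from functools import lru_cache
from math import comb

def binomial_cost(n_steps, c):
    """Return (r, T_b) with T_b = r*N - C(c+r, r-1), r minimal with N <= C(c+r, r)."""
    r = 1
    while comb(c + r, r) < n_steps:
        r += 1
    return r, r * n_steps - comb(c + r, r - 1)

@lru_cache(maxsize=None)
def optimal_cost(n_steps, c):
    """Minimal number of time step evaluations found by exhaustive recursion."""
    if n_steps == 1:
        return 0                               # the stored state is all that is needed
    if c == 1:
        return n_steps * (n_steps - 1) // 2    # always restart from the single checkpoint
    # advance m steps, checkpoint there, reverse the right part, then the left part
    return min(m + optimal_cost(n_steps - m, c - 1) + optimal_cost(m, c)
               for m in range(1, n_steps))

r, t_b = binomial_cost(16, 3)
```

For $N = 16$ and $c = 3$ both functions give 33 time step evaluations with $r = 3$.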



The proof of Theorem 1 (see [5] ) constructs recursively checkpointing schedules 
that attain the minimal number Tb. For the optimal checkpointing procedures 
the positions of the checkpoints are given by binomial coefficients. This fact 
explains the name binomial checkpointing. Furthermore, the proof of Theo- 
rem 1 shows that each time step Fi is evaluated at most r times. Hence, r 
has the same meaning as in the previous section. It was proved earlier that a 
logarithmic growth of memory and run-time can be achieved using binomial 
checkpointing by providing an appropriate number of checkpoints [4] . 

The routine revolve implements the optimal binomial checkpointing and can be incorporated easily in an existing adjoint calculation [2, 8]. Moreover, one can build a heuristic based on revolve such that the adjoint calculation using binomial checkpointing becomes applicable also if the number of time steps is not known a priori, e.g. due to adaptive time stepping, and/or if the temporal complexity of the time steps is not constant, e.g. due to implicit methods [9].

One optimal checkpointing schedule computed with revolve for $N = 16$ time steps and $c = 3$ checkpoints is shown in Fig. 3. Once more, it might be necessary to evaluate $F_N$ once to initialize the adjoints. Since the situation is the same for the multi-level checkpointing and does not influence the results in the sequel, we ignore throughout the evaluation of $F_N$. For the example shown above, the number of time step evaluations equals $T_b = 33$. Compared to the two-level checkpointing, the computing time for the adjoint calculation increases by less than 50 %. Furthermore, only 3 states have to be kept in memory. Hence, the storage requirement is reduced by 50 %. The relation between the two checkpointing approaches will be discussed in more detail in the next section.

4 Comparison of Both Checkpoint Distributions 

The integer r has the same meaning for both checkpointing approaches, namely 
the maximal number of times any particular time step Fi is evaluated during 
the adjoint calculation. Hence for comparing both approaches, assume at the 
beginning that r has the same value and that the same amount of memory is 
used, i.e. Mr = Mb = c. 

Now, we examine the maximal number of time steps N* for which an 
adjoint calculation can be performed using the two approaches. Assuming that 
r is a divisor of c and Mr — c, one obtains the identity 

iv;=(^ + l) = with = i = l,...,r, 

for the uniform checkpoint distribution because of the structure of $N_r$.
Theorem 1 yields

$$N_b^* = \binom{c + r}{c} = \prod_{i=0}^{r-1} \left(\frac{c}{r - i} + 1\right)$$

for the binomial checkpoint distribution. Obviously, one has

$$\frac{c}{r} + 1 < \frac{c}{r - i} + 1 \quad\text{for } 0 < i \le r - 1.$$

These inequalities yield $N_r^* < N_b^*$ if $r \ge 2$. Hence, for all $r \ge 2$ and a given
c, binomial checkpointing allows the adjoint calculation for a larger number of
time steps compared to uniform checkpointing. In more detail, using Stirling’s
formula we obtain

$$\frac{N_b^*}{N_r^*} \approx \frac{1}{\sqrt{2\pi r}}\, e^{r}.$$




Hence, the ratio of $N_b^*$ and $N_r^*$ grows exponentially in r without any
dependence on the number of available checkpoints. Fig. 4 shows $N_r^*$ and $N_b^*$ for
the most important values $2 \le r \le 5$.

Fig. 4. $N_r^*$ and $N_b^*$ for r = 2, 3, 4, 5 (four panels, "Maximal N for r=2" to
"Maximal N for r=5", each plotted against the number of checkpoints from 0 to 50)

Since r denotes the maximal number
of times each time step is evaluated, we have the following upper bounds for
the number of time steps evaluated during the adjoint calculation using r-level
checkpointing and binomial checkpointing, respectively:

$$T_r = c \left(\frac{c}{r} + 1\right)^{r-1} < r N_r^* \quad\text{and}\quad T_b = r N_b^* - \binom{c + r}{r - 1} < r N_b^*.$$

For example, it is possible to compute the adjoint for N = 23000 time steps
with only 50 checkpoints, less than 3N time step evaluations, and N adjoint
steps using binomial checkpointing instead of three-level checkpointing, where
$N_r^* < 5515$. If we allow 4N time step evaluations, then 35 checkpoints suffice
to compute the adjoint for 80000 time steps using binomial checkpointing,
where $N_r^* < 9040$. These numbers are only two possible combinations taken
from Fig. 4 to illustrate the drastic decrease in memory requirement
that can be achieved if binomial checkpointing is applied.
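
The two combinations quoted above can be checked directly. The sketch below assumes the bounds $(c/r + 1)^r$ for r-level checkpointing and $\binom{c+r}{c}$ for binomial checkpointing; the function names are illustrative only.

```python
from math import comb

def max_steps_uniform(c: int, r: int) -> float:
    """Upper bound (c/r + 1)^r for r-level (uniform) checkpointing."""
    return (c / r + 1) ** r

def max_steps_binomial(c: int, r: int) -> int:
    """Upper bound binom(c + r, c) for binomial checkpointing."""
    return comb(c + r, c)

# (c, r) pairs of the two examples: 50 checkpoints with r = 3,
# and 35 checkpoints with r = 4
for c, r in [(50, 3), (35, 4)]:
    print(c, r, max_steps_uniform(c, r), max_steps_binomial(c, r))
```

For both pairs the binomial bound exceeds the number of time steps quoted in the text, whereas the uniform bound stays below the quoted limits.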

However, usually the situation is the other way round, i.e. one knows N
and/or c and wants to compute the adjoint as cheaply as possible in terms of
computing time. Here, the first observation is that r-level checkpointing
introduces an upper bound on the number of time steps the adjoint of which can
be computed, because the inequality $N \le (c/r + 1)^r$ must hold. Furthermore,
in numerous cases binomial checkpointing also allows a decrease in run-time
compared to the uniform checkpointing. For a given r-level checkpointing and
$M_r = c$, one has to compare $T_r$ and $T_b$. Let $r_b$ be the unique integer satisfying
(2). Since at least one checkpoint has to be stored at each level, one obtains
the bound $r \le c$, i.e., one must have $c \ge \log_2(N)$ to apply uniform
checkpointing. Therefore, the following combinations of r and $r_b$ are possible for the
most important, moderate values of r:

r = 3: $r_b \in \{2, 3\}$,   r = 4: $r_b \in \{3, 4\}$,   r = 5: $r_b \in \{3, 4, 5\}$.

For $3 \le r \le 5$, one easily checks that $T_r > T_b$ holds if $r_b < r$. For $r = r_b$, one
can prove the following, more general result:

Theorem 2. Suppose for a given N and an r-level checkpointing with $M_r = c$
that the corresponding $r_b$ satisfying (2) coincides with r. Then, one has

$T_2 = 2N - c - 2 = T_b$   if $r = r_b = 2$,
$T_r > T_b$   if $r = r_b > 2$.

Proof: For $r_b = r = 2$ the identity $T_2 = T_b$ is clear. For $r = r_b > 2$, the
inequality

[displayed inequality relating the terms of $T_r$ and $T_b$; not recoverable from the source]

holds. Using the definitions of $T_r$ and $T_b$, this relation immediately yields

$T_r > T_b$. ■

Hence, except for the case $r = r_b = 2$, where $T_r$ and $T_b$ coincide, the run-time
caused by binomial checkpointing is less than the one caused by multi-level
checkpointing if $r = r_b$.
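
Theorem 2 can be spot-checked numerically. The following sketch assumes $N = (c/r + 1)^r$ with r dividing c, $T_r = c\,(c/r + 1)^{r-1}$, and $T_b = rN - \binom{c+r}{r-1}$ for $\binom{c+r-1}{c} < N \le \binom{c+r}{c}$. It checks single instances only and is not a substitute for the proof; all names are illustrative.

```python
from math import comb

def t_uniform(c: int, r: int) -> int:
    """T_r = c * (c/r + 1)^(r-1), assuming r divides c and N = (c/r + 1)^r."""
    return c * (c // r + 1) ** (r - 1)

def t_binomial(n_steps: int, c: int, r: int) -> int:
    """T_b = r*N - binom(c + r, r - 1); r must be the repetition number
    of N, i.e. binom(c + r - 1, c) < N <= binom(c + r, c)."""
    assert comb(c + r - 1, c) < n_steps <= comb(c + r, c)
    return r * n_steps - comb(c + r, r - 1)

# r = r_b = 2: both counts coincide with 2N - c - 2
c, r = 4, 2
n = (c // r + 1) ** r
print(t_uniform(c, r), t_binomial(n, c, r), 2 * n - c - 2)

# r = r_b = 3: the uniform distribution needs more time step evaluations
c, r = 9, 3
n = (c // r + 1) ** r
print(t_uniform(c, r), t_binomial(n, c, r))
```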



5 Conclusions 

This article discusses several checkpointing techniques, namely multi-level 
checkpointing and binomial checkpointing. A detailed analysis of the num- 
ber of time steps the adjoint of which can be calculated, the run-time needed 
for the adjoint calculation and the memory requirement is given. 

One can conclude that binomial checkpointing allows adjoint calculations 
with a surprisingly small fraction of the memory needed by the basic approach. 
This storage reduction causes only a very moderate increase in run-time. On 
the other hand, we see that r-level checkpointing induces for a given number of 
checkpoints an upper bound on the number of time steps the adjoint of which 
can be computed. This upper bound can only be increased by introducing a 
next level of checkpointing. In addition it is shown that the run-time required 
for the adjoint calculation with r-level checkpointing exceeds the run-time 
needed for binomial checkpointing for the most important values of r > 2, 
whereas for r = 2 both methods yield the same run-time. However, for r = 2 
and a given amount of memory, binomial checkpointing allows the adjoint 
computation for a larger number of time steps. Hence, even for r = 2 binomial 
checkpointing is preferable. 

Moreover, it is quite often the case that the number N of time steps is not
known a priori, for example due to an adaptive time stepping method. Then,
it becomes difficult to distribute the checkpoints for the two- or multi-level
checkpointing such that the minimal run-time is attained. For binomial
checkpointing, the extension a-revolve deals with the unknown number of time steps
by using a heuristic for the checkpoint placements. In addition, a-revolve can
also handle time steps with varying temporal complexity. For time steps the
cost of which does not change drastically, the heuristics implemented in a-revolve
work well such that the corresponding adjoint calculation is only a few percent
slower than the one based on revolve [9]. Hence, binomial checkpointing
provides memory-reduced adjoint calculation also in more general situations.

Summary. This paper uses the fictitious boundary method described in [1] for the
solution of incompressible flow with moving rigid bodies in complex geometries. The
method is based on a special treatment of Dirichlet boundary conditions inside
a FEM approach in the context of a hierarchical multigrid scheme such that the flow
can be efficiently computed on a fixed computational mesh while the solid boundaries
are allowed to move freely through the given mesh. In this paper, we focus on the
calculation of the drag and lift forces acting on the moving solid bodies, which are
not captured by the mesh. A comparison between the present and benchmark
results for the flow around a circular cylinder at different Reynolds numbers is
presented first, and then the result for a circular cylinder oscillating in a channel
is given. The simulation results agree reasonably well with the corresponding
reference results.



1 Introduction 

Incompressible flow problems with moving rigid bodies in complex geometries
have drawn the attention of numerous investigators. Their studies have been
motivated by the desire to understand the fundamental physics of such flows as
well as by their practical importance in various areas. Such flows
can be observed everywhere in our living environment, for example the flow
around high-rise buildings, the drag force on a car accelerating in
the wind, the interaction of ocean currents with offshore structures, sedimentation
flow in estuaries and sand flow in deserts.

From the numerical point of view, incompressible flow with moving rigid
bodies in complex geometries is quite hard to simulate, since it can require
a huge amount of time for the generation or deformation of the computational
grid when the corresponding boundaries are complex or changing. Such problems
have motivated the development of numerous algorithms, which can be
broadly classified into two families. One of them is the ‘body-conformal approach’,
which always keeps the computational mesh aligned with the geometrical
details [2, 3]. The other is the ‘fixed grid approach’, in which the mesh
is (arbitrarily) fixed and internal objects are allowed to move freely through
the mesh [4, 5]. One big advantage of such ‘fixed grid approaches’ over the
conventional ‘body-conformal approaches’ is that the computational mesh
remains unchanged, so that the CPU cost per time step can be significantly
decreased because the expensive mesh generation is avoided. Moreover, such
techniques can be easily incorporated into standard CFD codes, most of which
only allow fixed computational grids without local adaptivity. However, the
resulting accuracy is not clear. Therefore, the overall aim
is to deal successfully with the moving boundaries such that the accuracy of
the numerical approximation is sufficiently high while at the same time
the computational cost is decreased.

In the spirit of the ‘fixed grid approaches’, a simple and efficient ‘fictitious
boundary method’ for the detailed simulation of incompressible flow with
complex geometries and/or moving interfaces was developed in [1]. The
method is based on a fixed unstructured FEM background grid. It starts with
a coarse mesh which already contains many of the geometrical fine-scale details,
and employs a (rough) boundary parametrization which sufficiently describes
all large-scale structures with regard to the boundary conditions. Then, all
fine-scale features are treated as interior objects such that the corresponding
components in all matrices and vectors are unknown degrees of freedom which are
implicitly incorporated into all iterative solution steps (see [1]).

In this paper, we use the fictitious boundary method for the simulation of
incompressible flow with moving rigid bodies in complex geometries. In many
cases, the calculation of forces acting on the moving rigid bodies is very
important for the further study of the interaction between fluid and body, as in
particulate flow, sedimentation flow and fluid-structure interaction. However, in
the fictitious boundary method, it is not straightforward to
compute these forces, because the drag coefficient $C_d$ and lift coefficient
$C_l$ acting on the moving solid bodies are very delicate quantities: they depend on
the results directly on the wall surface of the moving rigid bodies, which is
represented only implicitly in the fictitious boundary method due to the use of a fixed
grid rather than a body-conformal grid. Therefore, the integral of forces
over the wall surface of rigid bodies cannot be implemented directly in the
fictitious boundary method. To overcome this difficulty, a volume integral
has been suggested in place of the conventional surface integral for the
calculation of $C_d$ and $C_l$, introducing an auxiliary function [7] or two
additional functions [8]. In such volume integral calculations, the reconstruction
of the wall surface of the moving rigid bodies can be avoided. In this paper,
we use Duchanoy’s idea of the volume integral [7] and extend his
implementation from a finite volume method to the finite element method and the
fictitious boundary method.



2 The fictitious boundary method 

The details of the fictitious boundary method have been described in [1]. For
the following considerations, let $\Omega$ be a bounded domain with a piecewise
smooth boundary $\Gamma$. The equations to be solved are the incompressible
Navier-Stokes equations

$$\rho \frac{\partial \mathbf{u}}{\partial t} + \rho\, \mathbf{u} \cdot \nabla \mathbf{u} = \mathbf{f} - \nabla p + \mu \nabla^2 \mathbf{u}, \qquad \nabla \cdot \mathbf{u} = 0, \qquad (1)$$

where $\mathbf{u}$ is the velocity, p the pressure, $\mu$ the dynamic viscosity coefficient,
$\rho$ the density and $\mathbf{f}$ the source term, which may include the gravitational force.
The above equations are to be solved with $\mathbf{u}(\mathbf{x}, t) = \mathbf{u}_a(\mathbf{x}, t)$ on parts of
the boundaries of the flow domain, where $\mathbf{u}_a(\mathbf{x}, t)$ is the prescribed boundary
velocity, including time-dependent moving boundaries. The details of solving
such incompressible flow problems can be found in the FeatFlow software [10,
11], which is based on (nonconforming) FEM discretizations, adaptive implicit
time-stepping, nonlinear Newton-like methods and (geometrical) multigrid solvers
(for velocity and pressure separately) on quite arbitrary domains.

In the following part, we describe a volume integral approach
for the calculation of the drag coefficient $C_d$ and lift coefficient $C_l$ acting on
the moving solid bodies. Let S be the wall surface of the rigid bodies,
$n_S = (n_x, n_y)$ the inward pointing unit normal with respect to $\Omega$ and
$\tau = (n_y, -n_x)$ the tangential vector. The drag and lift forces are usually
calculated by a surface integral as follows



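$$F_d = \int_S \left( \mu \frac{\partial u_\tau}{\partial n_S}\, n_y - p\, n_x \right) ds, \qquad F_l = -\int_S \left( \mu \frac{\partial u_\tau}{\partial n_S}\, n_x + p\, n_y \right) ds, \qquad (2)$$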



while the drag and lift coefficients are calculated via

$$C_d = \frac{2 F_d}{\rho U^2 D}, \qquad C_l = \frac{2 F_l}{\rho U^2 D}, \qquad (3)$$

where U is the characteristic velocity, and D the characteristic length. 
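
Once the forces are known, Eq. (3) is a one-line computation. The following minimal sketch illustrates it; the function name and the sample force value are hypothetical, chosen so that $\rho = 1$, $U = 0.2$ and $D = 0.1$ give a drag coefficient close to the benchmark value 5.5795 quoted in Section 3.1.

```python
def force_coefficient(force: float, rho: float, u_char: float, d_char: float) -> float:
    """Dimensionless coefficient C = 2 F / (rho * U^2 * D), cf. Eq. (3)."""
    return 2.0 * force / (rho * u_char ** 2 * d_char)

# Hypothetical drag force for rho = 1, U = 0.2, D = 0.1 (illustrative value)
drag_force = 0.011159
print(force_coefficient(drag_force, rho=1.0, u_char=0.2, d_char=0.1))
```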

From Eq. (3) and Eq. (2), we can see that a surface integral over the
wall surface of the rigid bodies has to be evaluated for the calculation of
$C_d$ and $C_l$. However, in the present fictitious boundary method, the shape
of the wall surface of the moving rigid bodies is only implicitly imposed in the
fluid field. Reconstructing the shape of the wall surface of the moving rigid
bodies is not only time consuming, but the resulting accuracy is also only first
order due to a piecewise constant interpolation. To overcome this problem,
we use the following method to calculate $C_d$ and $C_l$, in which the surface
integral is replaced by a volume integral. We define a parameter $\alpha$ as

$$\alpha(\mathbf{x}) = \begin{cases} 1 & \text{for } \mathbf{x} \in \Omega_c, \\ 0 & \text{for } \mathbf{x} \in \Omega, \end{cases} \qquad (4)$$

where $\mathbf{x}$ denotes the coordinates of the edge midpoints of cells, $\Omega_c$ is the
domain occupied by the rigid bodies, $\Omega$ is the fluid domain, and the whole domain
is $\Omega_T = \Omega \cup \Omega_c$. The importance of such a definition of the parameter can be
seen from the fact that the gradient of $\alpha$ is zero everywhere except at the wall
surface of the rigid bodies, where it equals, up to sign, the normal vector
$\mathbf{n}$ defined on the grid [7, 12], i.e.

$$\mathbf{n} = -\nabla \alpha. \qquad (5)$$

The total stress tensor $\sigma$ of the fluid flow is

$$\sigma = -p\,\mathbf{I} + \mu \left[ \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right]. \qquad (6)$$

Hence the forces acting over the wall surface of the rigid bodies can be
computed by

$$\mathbf{F} = \int_{\Omega_T} \sigma\, \mathbf{n}\, d\Omega = -\int_{\Omega_T} \sigma\, \nabla \alpha \, d\Omega. \qquad (7)$$

The drag force and lift force can be obtained from Eq. (7) as its components
in the x- and y-direction,

$$F_d = -\int_{\Omega_T} \left( \sigma_{xx} \frac{\partial \alpha}{\partial x} + \sigma_{xy} \frac{\partial \alpha}{\partial y} \right) d\Omega, \qquad (8)$$

$$F_l = -\int_{\Omega_T} \left( \sigma_{yx} \frac{\partial \alpha}{\partial x} + \sigma_{yy} \frac{\partial \alpha}{\partial y} \right) d\Omega. \qquad (9)$$

Therefore, through Eq. (8), Eq. (9) and Eq. (3) we can calculate the
drag and lift coefficients ($C_d$ and $C_l$) via a volume integral over the whole
domain $\Omega_T$ instead of the surface integral over the wall surface of the rigid
bodies in Eq. (2). The integral over each element covering the whole domain
$\Omega_T$ is evaluated with the standard 3x3 point Gaussian quadrature. Since the
gradient $\nabla \alpha$ is non-zero only at the wall surface of the rigid bodies, the
volume integrals need to be computed only in one layer of mesh cells around
the rigid bodies. This makes the calculation of $C_d$ and $C_l$ convenient in the
present fictitious boundary method.
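
The idea behind the volume integral can be illustrated on a much simpler setting than the FEM discretization used in this paper. The sketch below is only an illustration under stated assumptions: it uses a uniform grid with central differences instead of finite elements and Gaussian quadrature, a purely hydrostatic stress $\sigma = -p\,\mathbf{I}$ with the hypothetical pressure $p = -y$, and a disc as rigid body. In that case $-\int \sigma \nabla\alpha \, d\Omega = \int p \nabla\alpha \, d\Omega$ should reproduce the buoyancy force, whose vertical component equals the disc area. All names are illustrative.

```python
import math

def volume_integral_force(pressure, n=200, center=(0.5, 0.5), radius=0.2):
    """Approximate F = -integral(sigma * grad(alpha)) for sigma = -p I
    on the unit square, using central differences on an (n+1) x (n+1) grid."""
    h = 1.0 / n
    # Indicator alpha sampled at the grid nodes: 1 in the body, 0 in the fluid
    alpha = [[1.0 if math.hypot(i * h - center[0], j * h - center[1]) <= radius
              else 0.0
              for j in range(n + 1)]
             for i in range(n + 1)]
    fx = fy = 0.0
    for i in range(1, n):
        for j in range(1, n):
            dax = (alpha[i + 1][j] - alpha[i - 1][j]) / (2.0 * h)
            day = (alpha[i][j + 1] - alpha[i][j - 1]) / (2.0 * h)
            p = pressure(i * h, j * h)
            # sigma = -p I, hence -sigma * grad(alpha) = p * grad(alpha)
            fx += p * dax * h * h
            fy += p * day * h * h
    return fx, fy

fx, fy = volume_integral_force(lambda x, y: -y)
print(fx, fy, math.pi * 0.2 ** 2)
```

Only grid points next to the disc boundary contribute to the sums, which mirrors the remark that the volume integrals need to be computed only in one layer of cells around the rigid bodies.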



3 Numerical tests 

This section consists of two parts. The first part presents a quantitative exam- 
ination for the benchmark case of flow around a circular cylinder with Re = 20 
and 100 solved by the present fictitious boundary method. The second part 
gives the computing results for a circular cylinder oscillating in a channel. For 
comparison, corresponding reference results are also presented.

3.1 Flow around a circular cylinder

We consider the benchmark case of flow around a circular cylinder described
in [9]. The body-conformal mesh of Fig. 1 (a) is used to provide
reference results, while the channel meshes of Fig. 1 (b) and (c) are employed
by the present fictitious boundary method. The channel height is H = 0.41 m,
the cylinder diameter D = 0.1 m. The Reynolds number is defined by
$Re = \bar{U} D / \nu$ with the mean velocity $\bar{U} = 2U(0, H/2, t)/3$. The kinematic
viscosity of the fluid is given by $\nu = \mu/\rho = 10^{-3}$ m²/s and its density by
$\rho = 1$ kg/m³. The inflow profiles are parabolic with different U such that
the resulting Reynolds numbers are Re = 20 (steady case) and Re = 100
(nonsteady case).
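
The relation between the inflow profile and the Reynolds number can be made explicit with a short sketch. The peak inflow velocities 0.3 m/s and 1.5 m/s used below are assumptions (they are not stated above); they are the values that yield Re = 20 and Re = 100 for D = 0.1 m and $\nu = 10^{-3}$ m²/s.

```python
def reynolds_number(u_max: float, diameter: float = 0.1, nu: float = 1.0e-3) -> float:
    """Re = U_mean * D / nu with U_mean = 2/3 * U_max for a parabolic inflow."""
    u_mean = 2.0 * u_max / 3.0
    return u_mean * diameter / nu

# Assumed peak velocities of the parabolic inflow profile
for u_max in (0.3, 1.5):
    print(u_max, reynolds_number(u_max))
```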






(c) channel mesh II (LEVEL = 2) 
Fig. 1. Different coarse meshes 



Table 1. The number of elements for different refined meshes

LEVEL                  3       4       5        6        7
body-conformal mesh    384     1536    6144     24576    98304
channel mesh I         1088    4352    17408    69632    278528
channel mesh II        416     1664    6656     26624    106496



We first perform a stationary simulation (Re = 20), based on the
body-conformal mesh, the channel mesh I and the channel mesh II, respectively.
The coarse meshes shown are successively refined by connecting opposite
midpoints. Table 1 gives the number of elements for these meshes after such
global refinements. Here LEVEL corresponds to the number of refinements.
Table 2 shows the comparison of the drag coefficient $C_d$ and
the lift coefficient $C_l$ based on the body-conformal mesh, the channel mesh I
and the channel mesh II, respectively. The calculation of $C_d$ and $C_l$ based on
the body-conformal mesh uses the surface integral formula in Eq. (2) and
provides the reference results, while for the channel mesh I
and the channel mesh II the volume integral formulas in Eq. (8) and Eq. (9) are
employed. In this table, the results calculated from LEVEL = 3 to LEVEL = 7
are all shown together. The corresponding benchmark values from [9] are also
listed in the table. From the comparisons, it can be seen that the results
calculated by the present fictitious boundary method agree sufficiently well
(~ 1%) with both sets of reference results. The results for the channel mesh I are
found to be not completely satisfying, while the results for channel mesh II are
improved since the mesh is locally refined near the wall surface of the
cylinder. The results for such low Reynolds number simulations show that an
appropriate global grid refinement as well as adequate local mesh adaptation
are necessary. The present fictitious boundary method proves to be competitive
with the standard approaches for such typical CFD applications.



Table 2. Comparison of $C_d$ and $C_l$ for Re = 20

           C_d                                          C_l
LEVEL      Ref.          ch. mesh I    ch. mesh II    Ref.          ch. mesh I    ch. mesh II
3          0.53450D+01   0.55296D+01   0.54196D+01    0.56128D-02   0.12165D-01   0.24435D-02
4          0.55066D+01   0.53537D+01   0.54207D+01    0.84683D-02   0.10742D-01   0.67612D-02
5          0.55404D+01   0.54278D+01   0.55161D+01    0.98915D-02   0.61455D-02   0.89128D-02
6          0.55581D+01   0.55012D+01   0.55571D+01    0.10384D-01   0.99024D-02   0.94709D-02
7          0.55683D+01   0.55421D+01   0.55640D+01    0.10554D-01   0.97706D-02   0.10192D-01
Ref. [9]   0.55795D+01                                0.10618D-01
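
The agreement with the benchmark values can be quantified from the table entries. The following sketch computes the relative deviation of the LEVEL = 7 results from the values of [9]; the numbers are copied from Table 2, while the dictionary layout and function name are illustrative.

```python
# LEVEL = 7 values from Table 2 and the benchmark values from [9]
benchmark = {"C_d": 5.5795, "C_l": 0.010618}
level7 = {
    "Ref.":        {"C_d": 5.5683, "C_l": 0.010554},
    "ch. mesh I":  {"C_d": 5.5421, "C_l": 0.0097706},
    "ch. mesh II": {"C_d": 5.5640, "C_l": 0.010192},
}

def relative_deviation(value: float, reference: float) -> float:
    """Relative deviation |value - reference| / |reference|."""
    return abs(value - reference) / abs(reference)

deviations = {
    mesh: {name: relative_deviation(values[name], benchmark[name])
           for name in benchmark}
    for mesh, values in level7.items()
}
for mesh, dev in deviations.items():
    print(f"{mesh:12s} C_d: {100 * dev['C_d']:.2f} %   C_l: {100 * dev['C_l']:.2f} %")
```

On LEVEL = 7 the drag coefficient deviates by less than 1 % on all three meshes, and channel mesh II is closer to the benchmark than channel mesh I for both coefficients.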




Fig. 2. Periodical results of $C_d$ and $C_l$ for Re = 100

Next, we further examine the resulting accuracy for a medium range
Reynolds number Re = 100, which leads to periodic time-dependent vortex
shedding behind the cylinder. Since we are mainly interested in the spatial
accuracy of the fictitious boundary method, in particular in capturing the
important effects near the cylinder, we try to eliminate the temporal discretization
error by choosing very small time steps. Then, we continue the nonstationary
simulations until a fully periodic flow behaviour of all quantities has been
observed. Finally, we compare the results for one period. Figure 2 shows the
results for $C_d$ and $C_l$. The label "Standard L = 7" means that the body-conformal
mesh of Fig. 1 (a) with 7 refinement levels was used to provide the reference
result, while "FB L = 3 ~ 7" represents the results obtained by the present
fictitious boundary method using the channel mesh I in Fig. 1 (b) with different
refinement levels. From these figures, we can see that the various results
are in close agreement with the reference results. The results also show that
the present fictitious boundary method leads to results comparable to those of the
standard approaches with ‘body-fitted’ meshes. It can also be expected that
results with higher accuracy can be reached via local mesh adaptivity (see for
example the concept of aligned adaptive computational meshes in [1]).

3.2 An oscillating cylinder in a channel 

To demonstrate the ability of the present fictitious boundary method to handle
flows with complex moving boundaries, we have chosen a flow configuration
with a cylinder undergoing sinusoidal transverse oscillation in a channel
with specified amplitude and frequency. The channel mesh of Fig. 3 (a) is
employed by the present fictitious boundary method. The computational domain
size is (2.2 x 0.41). The mean location of the cylinder center $(X_0, Y_0)$ is
(1.1, 0.2) relative to the left bottom corner of the domain. The cylinder
diameter D is equal to 0.1. No-slip is prescribed on the left, right, top and bottom
boundaries. The cylinder is oscillating sinusoidally such that the location of
its center $(X_c, Y_c)$ is given by $X_c(t) = X_0 + A \sin(2\pi f t)$, $Y_c(t) = Y_0$, where
t is the time, and A = 0.25 and f = 0.25 are the amplitude and frequency of
the oscillation, respectively. The kinematic viscosity of the fluid is given by
$\nu = \mu/\rho = 10^{-3}$ m²/s and its density by $\rho = 1$ kg/m³. The fluid in the channel
is initially at rest. Since there is no benchmark result available for comparison,
we carried out a reference calculation to provide comparison data. In the
reference calculation, the body-conformal mesh of Fig. 3 (b) is used: the
cylinder is held fixed, while the coordinate system moves with the same motion
as, but in the opposite direction to, the moving cylinder in the calculation of
the fictitious boundary method. Table 3 gives the number of elements for the
channel mesh and body-conformal mesh in Fig. 3 with different numbers of
refinement levels.
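
The prescribed motion of the cylinder center is easy to tabulate. The sketch below evaluates the position given above and its time derivative over one period; the use of the derivative as the velocity of the rigid body is an assumption about how such a motion would enter a simulation, and all names are illustrative.

```python
import math

X0, Y0 = 1.1, 0.2     # mean position of the cylinder center
A, F = 0.25, 0.25     # amplitude and frequency of the oscillation

def cylinder_center(t: float) -> tuple[float, float]:
    """Center position (X_c(t), Y_c(t)) of the oscillating cylinder."""
    return X0 + A * math.sin(2.0 * math.pi * F * t), Y0

def cylinder_velocity(t: float) -> tuple[float, float]:
    """Time derivative of the center position."""
    return 2.0 * math.pi * F * A * math.cos(2.0 * math.pi * F * t), 0.0

period = 1.0 / F
for k in range(5):
    t = k * period / 4.0
    print(t, cylinder_center(t), cylinder_velocity(t))
```

With f = 0.25 the period is T = 4, which matches the length of the interval t = 19.79 to 23.79 used for the comparison over one period.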

(a) channel mesh (LEVEL = 1) (b) body-conformal mesh (LEVEL = 2)

Fig. 3. Coarse meshes used for the oscillating cylinder in a channel

Fig. 4 gives contour plots for the vorticity distribution obtained by the
fictitious boundary method based on the channel mesh. These pictures show
that the flow in the channel is disturbed by the oscillating cylinder, and that
vortices are generated periodically in the wake of the cylinder. The
wake becomes longest when the cylinder is at the end of its moving direction
($t = t_0 + \frac{1}{4}T$, $t_0 + \frac{3}{4}T$, where T is the time period), while when the cylinder is in
the middle position of its oscillation, the flow is strongly perturbed and
becomes more complex ($t = t_0$, $t_0 + \frac{1}{2}T$). Fig. 5 illustrates the comparison of the
drag coefficient $C_d$ and lift coefficient $C_l$ between the results of the fictitious
boundary method based on the channel mesh and the reference calculation
based on the body-conformal mesh. The results calculated from LEVEL = 4
to LEVEL = 7 are all shown together. The corresponding coefficients $C_d$ and
$C_l$ for one period between t = 19.79 and t = 23.79 are shown in Fig. 5 (c) and (f);
the solid line represents the results of the reference calculation based on the
body-conformal mesh at LEVEL = 7, while the dashed line denotes the results
obtained by the fictitious boundary method based on the channel mesh at
LEVEL = 7. From the comparisons, we can see that the FB and Ref. results
approach each other as the mesh is refined. The FB results calculated
by the present fictitious boundary method agree very well with
the reference results, although the FB results exhibit small oscillations due to
the non-aligned cylinder movement through the grid lines.



Table 3. The number of elements for different refined meshes

LEVEL                  4       5        6         7
channel mesh           8448    33792    135168    540672
body-conformal mesh    1792    7168     28672     114688




Fig. 4. Vorticity contour plot for an oscillating cylinder in a channel (panels
at instants $t_0$ plus fractions of the period T)

Fig. 5. Comparison of $C_d$ and $C_l$ between fictitious boundary (FB) and reference
(Ref.) results



4 Conclusions 

The presented fictitious boundary method for simulating incompressible flows
with moving rigid bodies in complex geometries has been validated in a
two-dimensional configuration. We showed that the use of an arbitrarily fixed
(unstructured) FEM background mesh is accurate enough to calculate sensitive
quantities (drag coefficient and lift coefficient) on the wall surface of the
cylinder. Comparisons of the results using a body-fitted mesh and a fixed
structured mesh show good agreement. The advantage of the present method
is that, since the body motion is independent of the mesh, problems associated
with mesh reconfiguration and motion are avoided, computations on a fixed
grid are cheaper than on a body-fitted one, and finally, the extension of the
method to 3D is straightforward. It is also worth noting that the ability
of the present method to accurately compute the forces acting on the moving
rigid bodies provides a solid base for further study of particulate flow
as well as the interaction between fluid and structure as proposed by Glowinski
in [4].

Summary. Initial-boundary value problems for systems of nonlinear parabolic
partial differential equations arise in many important practical applications in
electromagnetics, chemistry, modelling of diffusion and heat transfer processes and other
fields. We are concerned with their solution by means of the method of lines with
higher-order finite element spatial discretization on unstructured triangular meshes.
The development of realistic a-posteriori error estimates plays an essential role
in the application of a strategy of this type.



1 Introduction 

Initial-boundary value problems for systems of nonlinear parabolic partial
differential equations (PDEs) arise in many important practical applications in
electromagnetics, chemistry, modelling of diffusion and heat transfer processes
and many other fields.¹

We are concerned with the numerical solution of such problems by the
method of lines (MOL) combined with fully automatic hp-adaptive finite
element (FE) discretization on unstructured triangular meshes in space. This
approach has the potential of reducing the size of discrete problems
significantly while preserving the accuracy of results.

Until now, automatic hp-adaptivity has been applied almost exclusively to
stationary problems (see, e.g., [1, 4, 2] and references therein). It is our aim
to extend the promising automatic hp-adaptive strategies for elliptic problems
[7, 8] to parabolic equations, and in this paper we present two basic steps
towards this goal:

- Efficient implicit time-adaptive higher-order FE solver PARSYS_2D for
systems of nonlinear parabolic PDEs with general boundary conditions for all
solution components.

- A-posteriori error estimates appropriate for the class of evolutionary
problems studied.

¹ This work was supported by the Grant Agency of the Czech Republic under
projects No. GP102/01/D114 and 201/01/1200.



2 Definition of the problem 

Let $\Omega \subset \mathbb{R}^2$ be a bounded domain with piecewise-polynomial
Lipschitz-continuous boundary and $J = (0, T]$ a finite time interval. We consider a
system of $N_{eq}$ nonlinear parabolic equations

$$\begin{aligned}
\dot{u}(x, t) - \nabla \cdot \big( a(u, \nabla u, x, t)\, \nabla u(x, t) \big) &= f(u, \nabla u, x, t) && \text{in } \Omega,\ t \in J, \\
u(x, 0) &= v(x) && \text{in } \Omega, \\
u_i(x, t) &= u_i^D(x, t) && \text{on } \Gamma_i^D, \\
a_i(u, \nabla u, x, t)\, \partial u_i / \partial n &= g_i(x, t) && \text{on } \Gamma_i^N
\end{aligned} \qquad (1)$$

($\dot{u}$ stands for $\partial u / \partial t$) where $u = (u_1, \dots, u_{N_{eq}})$ is the solution, a and f are
smooth vector-valued functions, n is the unit outward normal to the boundary
and $\partial\Omega = \Gamma_i^D \cup \Gamma_i^N$ for all $i = 1, \dots, N_{eq}$. Further we denote
$g = (g_1, \dots, g_{N_{eq}})$. The $\nabla$ (nabla) operator is defined as usual,
$\nabla = (\partial / \partial x_1, \partial / \partial x_2)$.
The vector-valued coefficient $a = (a_1, \dots, a_{N_{eq}})$ is bounded,

$$0 < \mu \le a_i(\cdot) \le M, \qquad 1 \le i \le N_{eq}, \qquad (2)$$

and both a and f are Lipschitz-continuous,

$$\|a(r) - a(s)\| \le L \|r - s\|, \qquad (3)$$

$$\|f(r) - f(s)\| \le L \|r - s\| \qquad \forall r, s. \qquad (4)$$

Without loss of generality we restrict ourselves to homogeneous Dirichlet
boundary conditions for the formulation of the variational problem. In the
nonhomogeneous case, an appropriate vector-valued lift function is chosen that
yields an additional contribution to the right-hand side (see, e.g., [8] for
details). The variational form of (1) reads

$$\begin{aligned}
(\dot{u}(x, t), \varphi) + (a(u, \nabla u, x, t) \nabla u(x, t), \nabla \varphi) &= (f(u, \nabla u, x, t), \varphi) + (g(x, t), \varphi)_{\Gamma_N} \quad \forall \varphi \in V, \\
u(x, 0) &= v(x),
\end{aligned} \qquad (5)$$

where $t \in (0, T)$ and the form of the space $V \subset [H^1(\Omega)]^{N_{eq}}$ is dictated by $\Gamma_i^D$,
$1 \le i \le N_{eq}$. The symbol $(\cdot, \cdot)$ stands for the scalar product.

Following the concept of MOL, we discretize the spatial variable first and
leave the temporal variable continuous. Consider a finite element mesh $T_{h,p}$
covering $\Omega$, where a polynomial order $p(K_i) \ge 1$ is associated with each element
$K_i \in T_{h,p}$, $1 \le i \le N_{elem}$. Let $V_{h,p}(\Omega)$ be an appropriate piecewise-polynomial
subspace of V. We pose the semidiscrete problem to find $u_{h,p} \in V_{h,p}(\Omega)$ for
all $t \in J$, such that

$$\begin{aligned}
(\dot{u}_{h,p}, \chi) + (a(u_{h,p}, \nabla u_{h,p}, x, t) \nabla u_{h,p}, \nabla \chi) &= (f(u_{h,p}, \nabla u_{h,p}, x, t), \chi) + (g(x, t), \chi)_{\Gamma_N}, \\
u_{h,p}(x, 0) &= v_{h,p}(x),
\end{aligned} \qquad (6)$$

for all $\chi \in V_{h,p}(\Omega)$, $t \in J$. Here, $v_{h,p} \in V_{h,p}(\Omega)$ is the $H^1$-projection of
$v \in V(\Omega)$.
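
The method of lines can be illustrated with a deliberately simple example. The following sketch is not the hp-FEM approach of this paper and does not use PARSYS_2D: it assumes a scalar 1D model problem $u_t = (a(u) u_x)_x + f(u)$ on (0, 1) with homogeneous Dirichlet conditions, replaces the finite element discretization by second-order finite differences, and integrates the resulting system of ordinary differential equations with the explicit Euler method for brevity. All names are illustrative.

```python
import math

def mol_rhs(u, h, a, f):
    """Semidiscrete right-hand side du/dt for u_t = (a(u) u_x)_x + f(u)
    with homogeneous Dirichlet values u[0] = u[-1] = 0."""
    du = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        a_right = a(0.5 * (u[i] + u[i + 1]))   # coefficient at x_{i+1/2}
        a_left = a(0.5 * (u[i - 1] + u[i]))    # coefficient at x_{i-1/2}
        flux = a_right * (u[i + 1] - u[i]) - a_left * (u[i] - u[i - 1])
        du[i] = flux / (h * h) + f(u[i])
    return du

def integrate(u0, h, a, f, dt, n_steps):
    """Explicit Euler time stepping of the semidiscrete system."""
    u = list(u0)
    for _ in range(n_steps):
        du = mol_rhs(u, h, a, f)
        u = [ui + dt * dui for ui, dui in zip(u, du)]
    return u

# Linear test case a = 1, f = 0 with exact solution exp(-pi^2 t) sin(pi x)
n = 20
h = 1.0 / n
u0 = [math.sin(math.pi * i * h) for i in range(n + 1)]
u0[0] = u0[-1] = 0.0
u = integrate(u0, h, a=lambda s: 1.0, f=lambda s: 0.0, dt=5.0e-4, n_steps=200)
exact = math.exp(-math.pi ** 2 * 0.1) * math.sin(math.pi * 0.5)
print(u[n // 2], exact)
```

The spatial operator is discretized first and the time variable is left continuous until a time integrator is chosen, which is the separation that makes independent adaptivity in space and time possible.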



3 Finite-dimensional spaces 

Let $T_{h,p}$ be a mesh on $\Omega$ consisting of $N_{elem}$ disjoint triangles $K_i$,
$i = 1, \dots, N_{elem}$. The polynomial order $p(K_i) \ge 1$ is assigned to each element
$K_i$. Local polynomial orders on edges are determined using the minimum rule
(minimum of polynomial orders on adjacent elements). Polynomial orders on
all elements are collected into a vector $p = \{p(K_1), p(K_2), \dots, p(K_{N_{elem}})\}$. We
further define

$$V_{h,p,0}(\Omega) = \{ \varphi \in V;\ \varphi_k|_{K_i} \in P_{p(K_i)}(K_i),\ i = 1, \dots, N_{elem};\ \varphi_k|_{e_j} \in P_{p(e_j)}(e_j),\ j = 1, \dots, N_e;\ k = 1, \dots, N_{eq} \}. \qquad (7)$$

By $V_{h,q,0}(\Omega)$ we denote the space corresponding to a uniform distribution
of the polynomial order $p = \{q, q, \dots, q\}$.

The design of hierarchic basis functions of $V_{h,p,0}(\Omega)$ is well known (see, e.g.,
[8]). The basis consists of vertex functions (associated with mesh vertices), edge
functions (associated with edges where $p(e) \ge 2$) and bubble functions
(associated with element interiors where $p(K) \ge 3$). To simplify the explanation,
we introduce the following notation:

- $B^v_{p,1}$ ... the set of all vertex functions in the basis,

- $B^e_{p,q}$ ... the set of all edge functions of the polynomial order q,

- $B^b_{p,q}$ ... the set of all bubble functions of the polynomial order q,

- $B_{e_i}$ ... the set of all edge functions associated with the edge $e_i$,

- $B_{K_i}$ ... the set of all bubble functions associated with the triangle $K_i$.
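
The structure of such a hierarchic basis can be illustrated by counting the basis functions on a single triangle. The sketch below assumes the usual counts for hierarchic triangular elements, which are not stated in the text: 3 vertex functions, p(e) - 1 edge functions on an edge of order p(e), and (p(K) - 1)(p(K) - 2)/2 bubble functions on an element of order p(K). The function names are illustrative.

```python
def edge_order(p_left: int, p_right: int) -> int:
    """Minimum rule: polynomial order on an edge shared by two elements."""
    return min(p_left, p_right)

def count_edge_functions(p_edge: int) -> int:
    """Edge functions of orders 2, ..., p_edge on one edge."""
    return max(p_edge - 1, 0)

def count_bubble_functions(p_elem: int) -> int:
    """Bubble functions of orders 3, ..., p_elem on one triangle."""
    return (p_elem - 1) * (p_elem - 2) // 2

def local_dimension(p_elem: int, edge_orders: tuple[int, int, int]) -> int:
    """Number of basis functions that are non-zero on one triangle."""
    vertex = 3
    edges = sum(count_edge_functions(p) for p in edge_orders)
    return vertex + edges + count_bubble_functions(p_elem)

# Uniform order p: the count equals the dimension of the full polynomial space
for p in range(1, 6):
    print(p, local_dimension(p, (p, p, p)), (p + 1) * (p + 2) // 2)

# Element of order 4 with one neighbour of order 2 (minimum rule on that edge)
print(local_dimension(4, (edge_order(4, 2), 4, 4)))
```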



4 A-posteriori error estimation 

We extend a technique for a-posteriori estimation that was first proposed in
[3] and further developed in [5] (in both cases for 1D problems). For the sake
of clarity we restrict the explanation to the scalar case ($N_{eq} = 1$) with
homogeneous Dirichlet conditions on the whole boundary $\partial\Omega$, with $a$ and $f$
depending only on the exact solution $u$.

The error of the solution to the semidiscrete problem is defined as usual,

$e(x, t) = u(x, t) - u_{h,p}(x, t). \qquad (8)$
By $\tilde{u}_{h,p}$ we denote the elliptic projection of the exact solution to the space
$V_{h,p,0}(\Omega)$ (i.e. $q = p$). The main idea of the error estimation approach
can be outlined as follows: The problem is solved with $u_{h,p} \in V_{h,p,0}(\Omega)$.
Then the problem is solved once again with the polynomial order raised by one
(i.e. in $V_{h,q,0}(\Omega)$ with $q = p + 1$). The estimate is based on the difference
of the two solutions.

Let us begin with the identity

$(\dot{e}, \chi) + (a(u_{h,p}+e)\nabla e, \nabla\chi) = (f(u_{h,p}+e), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p}+e)\nabla u_{h,p}, \nabla\chi) \qquad (9)$

for almost every $t \in J$ and all functions $\chi \in H_0^1$, which follows directly from (8),
(5) and (6). Here the dot denotes the derivative with respect to $t$. Now our aim
is to introduce an easily computable error estimate $E(x,t)$ that should be close
to the quantity $e$.

Definition 1. Let us define the space 

We define three estimates $E_{PN}$, $E_{PL}$ and $E_{EL}$ as functions associated with
this space by means of the identity (9).

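Definition 2. We say that a function $E = E_{PN}$ is a nonlinear parabolic error
estimate if the identity

$(\dot{E}, \chi) + (a(u_{h,p} + E)\nabla E, \nabla\chi)$

$\qquad = (f(u_{h,p} + E), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p} + E)\nabla u_{h,p}, \nabla\chi) \qquad (10)$

holds for almost every $t \in J$ and all functions $\chi$ in the space of Definition 1,
and if the identity

$(a(v)\nabla E, \nabla\chi) = (a(v)\nabla(u - u_{h,p}), \nabla\chi) \qquad (11)$

holds for $t = 0$ and all functions $\chi$ in the same space.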



Definition 3. We say that a function $E = E_{PL}$ (or $E = E_{EL}$) is a linear
parabolic (elliptic) error estimate if the identity

$(\dot{E}, \chi) + (a(u_{h,p})\nabla E, \nabla\chi) = (f(u_{h,p}), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p})\nabla u_{h,p}, \nabla\chi) \qquad (12)$

or

$(a(u_{h,p})\nabla E, \nabla\chi) = (f(u_{h,p}), \chi) - (\dot{u}_{h,p}, \chi) - (a(u_{h,p})\nabla u_{h,p}, \nabla\chi) \qquad (13)$

holds for almost every $t \in J$ and all functions $\chi$ in the space of Definition 1,
and if identity (11) holds for $t = 0$ and the same functions $\chi$.

Further, let us introduce a function $e_{h,p}$ in the space of Definition 1 such that
$u_{h,p} + e_{h,p}$ is the elliptic projection of $u$ to $V_{h,p+1,0}(\Omega)$.
Definition 4. By $e_{h,p}(x,t)$ we denote a function in the space of Definition 1 such that

$(a(u)\nabla(u_{h,p} + e_{h,p}), \nabla\chi) = (a(u)\nabla u, \nabla\chi) \qquad (14)$

holds for almost every $t \in J$ and all functions $\chi$ in that space.

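Definition 5. By $\theta$ we denote the difference between the elliptic projection
$\tilde{u}_{h,p}$ of the exact solution and the semidiscrete solution $u_{h,p}$,

$\theta(x,t) = \tilde{u}_{h,p}(x,t) - u_{h,p}(x,t). \qquad (15)$

Further we define functions $\eta = \eta_{PN}, \eta_{PL}, \eta_{EL}$ for $E = E_{PN}, E_{PL}, E_{EL}$ such
that

$\eta(x,t) = e_{h,p}(x,t) - E(x,t). \qquad (16)$

The following proposition compares our estimate with the quantity $e_{h,p}$. By
$\|\cdot\|_1$ we denote the $H^1(\Omega)$ norm.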

Lemma 1. Let $E_{PN}$ be the error estimate given by (10) and (11), and let
$e_{h,p}$ be the function defined by (14). Let $\|\theta(\cdot,t)\|$ and
$\|\eta_{PN}(\cdot,t)\|$ be nondecreasing functions of the variable $t$ and let

$\|\eta_{PN}(\cdot,0)\| \le C h^{p+1}. \qquad (17)$

Let $e_{h,p}$ and $E_{PN}$ depend on $t$ in a sufficiently smooth way. Then there exists
a constant $C > 0$ such that

$\|\eta_{PN}\|_1 \le C h^{p+1}. \qquad (18)$

Remark 1. Analogous propositions can be shown also for $\|\eta_{PL}\|_1$ and $\|\eta_{EL}\|_1$
with minor differences in the proof.

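Proof. We start with the formula

$(\dot{E}, \chi) + (a(u_{h,p} + E)\nabla E, \nabla\chi) + (\dot{u}_{h,p}, \chi) + (a(u_{h,p} + E)\nabla u_{h,p}, \nabla\chi)$

$\qquad - (\dot{u}_{h,p}, \chi) - (\dot{e}_{h,p}, \chi) - (\dot{\rho}, \chi) - (a(u)\nabla(u_{h,p} + e_{h,p}), \nabla\chi) \qquad (19)$

$\qquad = (f(u_{h,p} + E), \chi) - (f(u), \chi),$

which follows from (5), (14) and (10), and use

$\rho(x, t) = u(x, t) - u_{h,p}(x, t) - e_{h,p}(x, t). \qquad (20)$

By rearranging terms, substituting $\eta$ for $\chi$, applying the Schwarz
inequality and using the fact that

$\|I_{h,p}w - w\| + h\|\nabla(I_{h,p}w - w)\| \le C h^{p+1}\|w\|_{p+1}$ when $w \in H^{p+1} \cap H_0^1$, $\qquad (21)$

where $I_{h,p}$ is the interpolation operator from $H_0^1$ into the finite element space,
we arrive at

$\frac{1}{2}\frac{d}{dt}\|\eta\|^2 + \delta\|\nabla\eta\|^2 \le C_1(t)\|\eta\|^2 + C_2(t)h^{2p+2} \qquad (22)$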
where $\delta > 0$. We omit the term with $\|\nabla\eta\|^2$ on the left-hand side. Integrating
both sides of (22), we find

$\|\eta(t)\|^2 \le \|\eta(0)\|^2 + C_3(t)h^{2p+2} + 2\int_0^t C_1(s)\|\eta(s)\|^2\,ds \qquad (23)$

if the corresponding functions are Lebesgue integrable on the interval $[0,T]$.
Employing now (17), assuming that the corresponding functions are again
Lebesgue integrable on $[0,T]$, and applying Gronwall's lemma, we obtain the
bound

$\|\eta(t)\|^2 \le C_4(t)h^{2p+2}. \qquad (24)$

Turning now back to the inequality (22) and assuming that $\|\eta\|$ is nondecreasing,
we can write

$\frac{d}{dt}\|\eta\|^2 \ge 0. \qquad (25)$

Using (24) and (25), we finally obtain from (22) that

$\|\nabla\eta\|^2 \le C_5(t)h^{2p+2}. \qquad (26)$

The statement of the lemma follows directly from (24) and (26). □
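
The step from (23) to (24) can be spelled out as follows. This is only a sketch under assumptions that the text does not state explicitly: that (17) bounds $\|\eta(\cdot,0)\|$ by $C h^{p+1}$, that $C_1 \ge 0$, that $C_3$ is nondecreasing in $t$, and that Gronwall's lemma is used in its integral form with a nondecreasing free term. The symbols $\varphi$, $\alpha$ and $\beta$ are introduced for this illustration only.

```latex
% Integral form of Gronwall's lemma (assumed): if
%   \varphi(t) \le \alpha(t) + \int_0^t \beta(s)\,\varphi(s)\,ds
% with \beta \ge 0 and \alpha nondecreasing, then
%   \varphi(t) \le \alpha(t)\,\exp\Bigl(\int_0^t \beta(s)\,ds\Bigr).
\begin{align*}
\varphi(t) &= \|\eta(t)\|^2, \qquad
\alpha(t) = \|\eta(0)\|^2 + C_3(t)\,h^{2p+2}, \qquad
\beta(s) = 2\,C_1(s), \\
\alpha(t) &\le \bigl(C^2 + C_3(t)\bigr)\,h^{2p+2}
  \quad \text{by (17)}, \\
\|\eta(t)\|^2 &\le \bigl(C^2 + C_3(t)\bigr)
  \exp\Bigl(2\int_0^t C_1(s)\,ds\Bigr)\,h^{2p+2}
  =: C_4(t)\,h^{2p+2}.
\end{align*}
```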



For each of the three above error estimates let us introduce an effectivity index
$\Theta_{PN}$, $\Theta_{PL}$, and $\Theta_{EL}$, respectively, defined as

$\Theta = \ldots \qquad (27)$



The following theorem contains a principal result related to $\lim_{h \to 0} \Theta$:



Theorem 1. Let $u(t) \in \ldots \cap H_0^1$ and $u_{h,p}(t) \in V_{h,p,0}(\Omega)$ be solutions
of (5) and (6) for all $t \in J$. Let $E$, in the space of Definition 1, be the solution
of (10), (11) (for $E_{PN}$), (11), (12) (for $E_{PL}$), or (13) (for $E_{EL}$). Then, under
appropriate regularity assumptions,

$\lim_{h \to 0} \Theta = 1 \qquad (28)$

for almost every $t \in J$, where $\Theta$ is $\Theta_{PN}$, $\Theta_{PL}$ or $\Theta_{EL}$.

Proof. The proof is rather technical (see [10]). Its main idea is the same as in
the 1D case presented in [5]. □
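
The idea of estimating the error from the difference between an order-$p$ and an order-$(p+1)$ approximation can be tried out on a much simpler analogue. The sketch below is not the finite element estimator of this paper: it replaces the FE solutions by $L^2(-1,1)$ projections onto polynomials, and it assumes that the effectivity index is the ratio of the norm of the estimate to the norm of the true error. All function names are hypothetical.

```python
import math


def legendre(k, x):
    """Legendre polynomial P_k(x) via the three-term recurrence."""
    p_prev, p_curr = 1.0, x
    if k == 0:
        return p_prev
    for n in range(1, k):
        p_prev, p_curr = p_curr, ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
    return p_curr


def simpson(g, a=-1.0, b=1.0, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def projection(f, p):
    """L2(-1, 1) projection of f onto polynomials of degree <= p."""
    coeffs = [(2 * k + 1) / 2 * simpson(lambda x: f(x) * legendre(k, x))
              for k in range(p + 1)]
    return lambda x: sum(c * legendre(k, x) for k, c in enumerate(coeffs))


def l2_norm(g):
    return math.sqrt(simpson(lambda x: g(x) ** 2))


def effectivity(f, p):
    """Ratio ||E|| / ||e|| with E = (order p+1) - (order p), e = f - (order p)."""
    low, high = projection(f, p), projection(f, p + 1)
    estimate = l2_norm(lambda x: high(x) - low(x))  # computable without f
    error = l2_norm(lambda x: f(x) - low(x))        # needs the exact f
    return estimate / error


thetas = [effectivity(math.exp, p) for p in (1, 2, 3)]
print(thetas)
```

In this analogue the ratio approaches 1 as $p$ grows, because the first neglected Legendre coefficient dominates the remaining ones for a smooth function. Theorem 1 concerns the different limit $h \to 0$ for the finite element estimates.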



5 Numerical results 

5.1 Model problem 

To verify the theoretical results, we set up a model problem (1), where we
chose the analytical solution as

$u(x, t) = \cos(50(x_1 + x_2) - t)\,x_1 x_2 (1 - x_1)(1 - x_2) \qquad (29)$

and the coefficient $a = a(s) = 1 + s$, $s \in \mathbb{R}$. The right-hand side function $f$
was calculated to yield the exact solution $u$.
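
Computing a right-hand side that reproduces a chosen exact solution can be sketched numerically. The code below assumes the scalar equation $\partial u/\partial t - \nabla\cdot(a(u)\nabla u) = f$, which is inferred from the weak identities of Section 4 and is not quoted from (1). It approximates $f$ by central differences instead of deriving it analytically. The helper `manufactured_rhs` is hypothetical, and the coefficient passed for (29) is an assumption.

```python
import math


def u_exact(x1, x2, t):
    """Analytical solution (29)."""
    return math.cos(50 * (x1 + x2) - t) * x1 * x2 * (1 - x1) * (1 - x2)


def manufactured_rhs(u, a, x1, x2, t, h=1e-3):
    """Approximate f = du/dt - div(a(u) grad u) by central differences."""
    du_dt = (u(x1, x2, t + h) - u(x1, x2, t - h)) / (2 * h)
    centre = u(x1, x2, t)
    divergence = 0.0
    for d1, d2 in ((h, 0.0), (0.0, h)):
        plus = u(x1 + d1, x2 + d2, t)
        minus = u(x1 - d1, x2 - d2, t)
        # a(u) is evaluated at the two midpoints of the stencil
        a_plus = a(u(x1 + d1 / 2, x2 + d2 / 2, t))
        a_minus = a(u(x1 - d1 / 2, x2 - d2 / 2, t))
        divergence += (a_plus * (plus - centre) - a_minus * (centre - minus)) / h**2
    return du_dt - divergence


# Sanity check with a known case: u = exp(-2t) sin(x1) sin(x2) solves the
# heat equation (a = 1) with zero right-hand side.
def heat(x1, x2, t):
    return math.exp(-2 * t) * math.sin(x1) * math.sin(x2)


residual = manufactured_rhs(heat, lambda s: 1.0, 0.3, 0.7, 0.1)

# Right-hand side for (29) at one sample point, with a(s) = 1 + s (assumed).
f_sample = manufactured_rhs(u_exact, lambda s: 1.0 + s, 0.3, 0.7, 0.4)
print(residual, f_sample)
```

In practice an analytically derived $f$ is preferable, since the finite-difference step adds its own error. The sketch only shows which terms enter the computation.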

The values of $\Theta_{EL}$ and $\Theta_{PL}$ for $t = 0.4$, $h = \sqrt{13}/36$ and for different
$p$'s and mesh parameters are summarized in Tables 1 and 2. The implicit
Euler formula with the step $\Delta t = 0.02$ has been used for the time integration.
According to Theorem 1, the values should tend to 1 as $h \to 0$.





Mesh sizes: h, h/2, h/4, h/8 (values listed in the printed order)

p = 1:  1.01140677283  1.00251576073  0.99790790301
p = 2:  4.09651988491  1.00652321679  0.99884836859  0.99587534487
p = 3:  2.68370145897  0.98450775723  0.99764998322  0.99996695628
p = 4:  2.07009170501  0.96472314783  0.99629420488  1.01800491423

Table 1. Values of $\Theta_{EL}$ for different $h$'s and $p$'s.





Mesh sizes: h, h/2, h/4, h/8 (values listed in the printed order)

p = 1:  1.00953184295  1.00221228302  0.99774286135
p = 2:  4.08160620473  1.00589904628  0.99868374826  0.99582769340
p = 3:  2.67782363272  0.98419486064  0.99752249029  0.99993253125
p = 4:  2.06631766115  0.96447454547  0.99620694415  1.01798017133

Table 2. Values of $\Theta_{PL}$ for different $h$'s and $p$'s.



6 Brief description of PARSYS_2D 

During the first stage of the project we implemented a robust higher-order
FE solver PARSYS_2D for systems of nonlinear parabolic PDE's in 2D, analogously
to what we did in 1D (see the PARSYS package [6]). Implicit time-integration
methods also allow the solver to solve systems of elliptic PDE's after leaving
out the time derivative. We use our software XGEN (see [9]) to generate
unstructured triangular meshes of very good quality on domains with arbitrarily
complicated geometries. All the software is available on the Internet^.

^ The mentioned C++ software packages PARSYS, PARSYS_2D and XGEN
can be downloaded free of charge from the web page of Pavel Solin,
http://www.caam.rice.edu/~solin.
6.1 Definition of the solved system of PDE's

It is not trivial to find an efficient and user-friendly way to define a system of
nonlinear PDE's. We approached this problem by allowing the user to specify
a large number of matrix- and vector-valued coefficients. The solved equation
has the general form



$C\,\frac{\partial U(x,t)}{\partial t} = \frac{\partial}{\partial x_1}\Bigl(\cdots\Bigr) + \frac{\partial}{\partial x_2}\Bigl(\cdots\Bigr) + \cdots + P_7\,U(x,t) + \cdots \qquad (30)$



where $U(x,t) = (U_1(x,t), \ldots, U_{N_{eq}}(x,t))$ is the solution vector with components
$U_i$, $i = 1, \ldots, N_{eq}$, for $t \in J$. The user is requested to prescribe seven
$N_{eq} \times N_{eq}$ matrices



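$P_i = P_i(\cdots), \quad i = 1, \ldots, 7,$

together with a diagonal matrix

$C = \mathrm{diag}\Bigl(C_1\bigl(U, \tfrac{\partial U}{\partial x_1}, \tfrac{\partial U}{\partial x_2}, x, t\bigr), \ldots, C_{N_{eq}}\bigl(U, \tfrac{\partial U}{\partial x_1}, \tfrac{\partial U}{\partial x_2}, x, t\bigr)\Bigr)$

and a source term vector that may depend, among other arguments, on derivatives of $U$.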

Remark 2. In (30), we apply $\partial/\partial x_1$ and $\partial/\partial x_2$ to vectors. By this notation we
mean that these operators are applied to each component of the vector.

The vector-valued initial condition has the standard form $U_i(x,0) = U_i^0(x)$,
$i = 1, \ldots, N_{eq}$. To each solution component $i = 1, \ldots, N_{eq}$, we can prescribe
either Dirichlet or Neumann boundary conditions. The Dirichlet boundary
conditions have the form

$U_i(x,t) = \ldots, \quad x \in \partial\Omega, \quad i = 1, \ldots, N_{eq}. \qquad (31)$

The Neumann boundary conditions are prescribed in the form 

where $n = (n_1, n_2)$ is the outer unit normal to the boundary. One can prescribe
various types of boundary conditions on different parts of the boundary.
6.2 Spatial and temporal discretization

The solver distributes the polynomial orders $p(K_i)$ in the mesh $T_{h,p}$ according
to the user's input and uses the above-mentioned minimum rule to determine the
local polynomial orders associated with edges. Partially novel Lobatto-based
hierarchic shape functions (see [8]) up to the 10th order were used in order to
obtain finite elements with excellent conditioning properties. For the sake of
efficiency, economical Gauss quadrature points and weights from the CD-ROM
[8] up to the 20th polynomial order were utilized.

The spatial semidiscretization yields a system of nonlinear ordinary differ- 
ential equations (ODE’s). The user can choose either to use the built-in implicit 
Euler scheme which is only first-order accurate, or to utilize absolutely sta- 
ble implicit higher-order adaptive ODEPACK subroutines. These subroutines 
are very sophisticated and one of their significant advantages is that they are 
based on explicit evaluation of the right-hand side of the ODE system - thus, 
no linearization whatsoever is needed by the user. The solvers are capable of 
numerically obtaining information about the Jacobi matrix of the right-hand 
side that is needed for the backward differentiation formula (BDF) on which 
they are based. 
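
The point that the user supplies only the right-hand side, while derivative information is obtained numerically, can be illustrated by a scalar analogue. The sketch below is not code from PARSYS_2D or ODEPACK. It implements the first-order implicit Euler scheme with a Newton iteration whose derivative is approximated by a finite difference. The function name `implicit_euler` and the test equation are assumptions, while the step 0.02 and the final time 0.4 are the values quoted in Section 5.1.

```python
import math


def implicit_euler(rhs, y0, t0, t_end, dt, tol=1e-12, max_newton=20):
    """Implicit Euler for y' = rhs(t, y). The derivative needed by Newton's
    method is approximated numerically, so only rhs must be supplied."""
    steps = round((t_end - t0) / dt)
    t, y = t0, y0
    for _ in range(steps):
        t_new, z = t + dt, y  # previous value as the initial Newton guess
        for _ in range(max_newton):
            residual = z - y - dt * rhs(t_new, z)
            eps = 1e-7 * max(1.0, abs(z))
            d_rhs = (rhs(t_new, z + eps) - rhs(t_new, z)) / eps
            step = residual / (1.0 - dt * d_rhs)
            z -= step
            if abs(step) < tol:
                break
        t, y = t_new, z
    return y


# Test problem (assumed): y' = -y**3, y(0) = 1, exact solution 1/sqrt(1 + 2t).
y_num = implicit_euler(lambda t, y: -y**3, 1.0, 0.0, 0.4, 0.02)
y_ref = 1.0 / math.sqrt(1.0 + 2.0 * 0.4)
print(y_num, y_ref, abs(y_num - y_ref))
```

For a system of ODE's the scalar derivative becomes the Jacobi matrix of the right-hand side, and each Newton step requires the solution of a linear system. A higher-order BDF method changes the residual but not this overall structure.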



6.3 Outlook 

The combination of hp-adaptive higher-order FE discretization in space with
the adaptive higher-order time-integration methods of ODEPACK offers exciting
perspectives for the numerical solution of large evolutionary problems. We
hope to report soon on progress in the implementation of automatic hp-adaptive
strategies in our 2D solver.